Data-Driven Growth - Course Notes
Davis Balaba
Data-driven Growth
365 DATA SCIENCE
Table of Contents
1. Introduction
2.1 The stages of data maturity and what you will see next
Abstract
The growth mindset manifests itself in a culture of discontent with the current state.
The underlying assumption is that there is always more value to be uncovered and
What if our business does not have as much data as a large tech company?
2. You don’t realize how much data you have until you start focusing on it
The course provides actionable advice on how to get started on your data journey.
1. Ensures that you always work on the most important problems first
1. Introduction
then we will see higher than average ROI compared to other forms of marketing”)
a) Value statement
b) Measurable outcome
- Competition
- Expertise
- Ease of adoption
III. Testing
Test fast and learn faster
can start to see how much higher incremental impact the growth mindset has
Yes, it does. Any vertical in which you can have multiple challenges and possible
2.1 The stages of data maturity and what you will see next
Data maturity is defined by data availability and your ability to access and take
advantage of it.
• Stage 2 (intermediate) – small scale data; you can perform analyses; report
End-to-end functionality to manage the data and innovate from your data-
driven findings.
The higher your data maturity level, the more hypotheses you can test. The more
When you have no data, there are a variety of techniques that can be used to get
insights. It all boils down to asking your customers in some way, shape, or form.
At this stage, you have some data but are not extracting much value from it.
For example:
You have data about what people click on your website, but you are not
using the data to make any decisions about what products to focus on or who to
Actions taken at this stage can yield remarkable results because you have not
Oftentimes the data offers clues about what to focus on. You can use them to stack rank
• Opportunity size
• Cost
• Risk
Only by understanding your current state of affairs can you objectively determine if the
effort and resources you put into activities have a good return on investment.
Keep in mind that when you have only a few data points, results tend to be skewed
and the data may not be truly representative. Take this into account when
making decisions.
Hypothesis:
We believe that by analysing the customer journey we can identify the weak points
in our funnel.
Focussing our actions on these weak points will drive incremental sales.
Customer journey:
1. See ad
2. Click ad
3. Visit site
4. Click ‘buy now’
5. Add to cart
6. Purchase
Before starting the analysis, it is important to understand the reason why you do it.
This enables you to identify a clear outcome you are aiming for.
This is important because, when you first get hold of data, you can go down many
different paths. But only a few of them will lead you to an outcome, which is
If you set certain benchmarks before starting the analysis, you can start to
Incremental sales – revenue that would not have happened if we had not taken
certain actions
Analysis plan:
based on that we can start hypothesizing where the biggest opportunities are.
One caveat of spreadsheet tools is that they limit the number of
A data dictionary defines every variable (every field) that you have in the data. It
tells you what it means and what values it is supposed to take. This is useful
when starting your analysis, and even more useful later, when you or someone
else comes back to the analysis to refresh or edit it.
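A data dictionary can be as simple as a small structure kept next to the dataset. The sketch below is illustrative only: the field names (`customer_id`, `funnel_step`, `order_value`), the allowed values, and the `validate_row` helper are hypothetical and not taken from the course.

```python
# Hypothetical data dictionary: one entry per field, stating what the
# field means and which values it is supposed to take.
DATA_DICTIONARY = {
    "customer_id": {
        "description": "Unique identifier of the customer",
        "type": str,
    },
    "funnel_step": {
        "description": "Last step the customer reached in the funnel",
        "type": str,
        "allowed": {"see_ad", "click_ad", "visit_site",
                    "click_buy_now", "add_to_cart", "purchase"},
    },
    "order_value": {
        "description": "Order value in USD; 0 if there was no purchase",
        "type": float,
        "min": 0.0,
    },
}


def validate_row(row):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, spec in DATA_DICTIONARY.items():
        if field not in row:
            problems.append(f"{field}: missing")
            continue
        value = row[field]
        if not isinstance(value, spec["type"]):
            problems.append(f"{field}: expected {spec['type'].__name__}")
            continue
        if "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"{field}: unexpected value {value!r}")
        if "min" in spec and value < spec["min"]:
            problems.append(f"{field}: below minimum {spec['min']}")
    return problems
```

Running `validate_row` on each record when an analysis is refreshed gives whoever picks it up later a quick check that the data still matches its documented meaning.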
Data improvement plan – actionable steps that can be taken to improve data
availability and quality in the future when more resources are available
Typically, you will use the insights generated with the smaller dataset to justify
getting more investment to get more sophisticated tools and get more talented
Conversion rate: the percentage of targets that perform the action that you are
Example:
An ad is shown to 100 customers. 5 of them click on the ad. Then the ad’s
Drop-off rate: the percentage of customers who did not perform the intended action.
95/100 = 95%
Step            Customers   Drop-off before the next step
Visit site      2.63k       67% of site visitors didn’t click ‘buy now’
Click buy now   877         43% of ‘buy now’ clickers didn’t add anything to the cart
Add to cart     497         31% of those who added to cart didn’t purchase
Purchase        344
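The drop-off percentages in the funnel figures above can be reproduced from the raw counts. This is a minimal sketch: the counts come from the notes (2.63k written out as 2630), while the `step_rates` helper is a hypothetical name introduced for illustration.

```python
# Funnel counts taken from the figures above (2.63k written out as 2630).
FUNNEL = [
    ("Visit site", 2630),
    ("Click buy now", 877),
    ("Add to cart", 497),
    ("Purchase", 344),
]


def step_rates(funnel):
    """Return (from_step, to_step, conversion, drop_off) per transition."""
    rates = []
    for (step, count), (next_step, next_count) in zip(funnel, funnel[1:]):
        conversion = next_count / count
        rates.append((step, next_step, conversion, 1 - conversion))
    return rates


for step, next_step, conversion, drop_off in step_rates(FUNNEL):
    print(f"{step} -> {next_step}: "
          f"{conversion:.0%} converted, {drop_off:.0%} dropped off")
```

Rounded to whole percentages, the drop-offs come out at 67%, 43%, and 31%, matching the figures above.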
The question to answer is: Where is the biggest opportunity for improvement, that
Beware of surgical targeting (having too narrow targeting that results in very high
CTR). The downside of this strategy is that your basket of opportunity is very narrow.
You might be better off diluting the CTR but expanding to a larger audience.
Conclusion: There is huge upside in increasing the number of people who
Hypothesis: We believe that we can double our reach by advertising more via
paid search and paid social. The incremental reach will result in incremental sales.
Risks:
Current margin per unit will decrease. Despite this, overall profit might increase if
The question that needs to be answered is – “Is our site optimized to sell?”
Observations:
• It is rendered as a bright-green button that sits at the top right of the site
• 40% of site visitors do not visit a page where we have ‘Buy now’
• Of the 2.63k site visitors, only 1.6k see the buy button
• Since 877 of them click, the true ‘Buy now’ conversion rate is about 56% i.e. the
Hypothesis: We believe that we can increase ‘Buy now’ clicks by adding the button
to every page. This will result in incremental ‘Buy now’ clicks, which will lead to
incremental sales.
additional 1k customers to it. These customers are likely lower intent so we can
assume a lower conversion rate for estimation. Let’s assume their ‘Buy now’ click
So 1k users with a ‘Buy now’ click rate of 10% means we can have an additional
incremental 42 purchases.
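The arithmetic behind the first steps of this estimate can be sketched as follows. The inputs are the figures quoted in the notes (2.63k visitors, 877 clicks, 40% of visitors never seeing the button, 1k newly exposed users, an assumed 10% click rate); the function names are hypothetical.

```python
SITE_VISITORS = 2630          # from the funnel figures
BUY_NOW_CLICKS = 877          # from the funnel figures
SHARE_SEEING_BUTTON = 0.60    # 40% never reach a page with the button


def true_click_rate(visitors, clicks, share_seeing_button):
    """Click rate among only those visitors who actually saw the button."""
    exposed = visitors * share_seeing_button
    return clicks / exposed


def incremental_clicks(newly_exposed, assumed_click_rate):
    """Extra 'Buy now' clicks from newly exposed, lower-intent visitors."""
    return newly_exposed * assumed_click_rate


rate = true_click_rate(SITE_VISITORS, BUY_NOW_CLICKS, SHARE_SEEING_BUTTON)
extra = incremental_clicks(newly_exposed=1000, assumed_click_rate=0.10)
print(f"True 'Buy now' click rate: {rate:.0%}")
print(f"Incremental 'Buy now' clicks: {extra:.0f}")
```

Turning the extra clicks into purchases additionally requires the add-to-cart and purchase rates further down the funnel, which this sketch deliberately leaves out.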
43% of ‘Buy now’ clickers did not add anything to the cart.
Users who click ‘Buy now’ have shown some intent to buy. From our research cart
Possible reasons:
Observations:
• Items are sometimes not preserved in the cart when the user navigates
away
• 10% of users who add items to the cart end their session with an empty cart
Hypothesis: We believe we can improve our add-to-cart rate by introducing
time-sensitive discounts. This will lead to incremental add-to-carts and more sales.
We believe that re-engineering our site to preserve items added to cart will
reduce user frustration. This will lower cart abandonment and result in incremental
sales.
Risks:
• Some users may simply have a longer, more deliberative purchase journey.
• Adding new elements to the site could lead to new bugs and breaks in the
purchase flow.
• Define the effort needed to execute an idea and the impact (in terms of
• Pre/post analysis
• Time-series
• Diff-in-diff
• Causal inference
b. Challenging if user identifiers can evolve over time or can be hidden from
you
c. Challenging if you can’t cleanly separate test groups from control groups so
the effort to have more or higher quality data can unlock a disproportionate
with data.
One of the first things you can do when you get to data maturity level 2 is collect
more data. A very important question here is ‘What is the most valuable data to
collect?’ ‘What data can you collect that will help you answer strategic questions?’
Visits website
-missing step-
Checkout
Fallacy of complexity or Occam’s razor: Denies the assumption that the more complex your
Proof of concept:
• Why does your role or team exist and how does your role contribute to key
company goals?
• What are the different ways in which your role/team can achieve this?
• Which way is most likely to have the largest impact on what your role/team
is trying to achieve?
• How will you validate or invalidate your hypotheses?
At this level, you have a data infrastructure that gives you substantial data with
sufficient accuracy.
Ideally, you have significant information about the people who click on your
website. In essence, you have started to explain why things are happening using
data.
You can start to define quantitative metrics and have clear goals and measures of
Lagging metric – your ultimate goal but not one that you can directly work on
Leading metric – a goal that you can work towards directly, which impacts the
Measuring your progress towards your goal is important because you can:
2. make sure your metric is not changing due to something that you didn’t do
3. make sure people get credit for their impact on the metric
3 stages you can work towards to establish a causal relationship between your
program and one that is not; for example, you have two cohorts and offer a
promotion to only one of them to see the impact of the promotion. The risk
differences between the two groups because after all they are different
groups of people; A way to decrease this risk is to measure the metrics prior
• Run – randomized controlled trial or A/B testing; You are able to randomize the
control and test group; In this way, if there is a difference between your
groups, you are much more confident that this is due to your treatment and
2. You must be able to randomize your target userbase and apply the same
treatment
3. You need a big enough sample size for each of your different
Novelty effect – when consumers respond to new things at a much higher rate than
2. Instability
higher revenue at first, but since people will stock up on courses, customers might
not purchase additional materials for some time after the sale.
4. Occurrence of outliers
It is always important to inspect your data for data points that can skew it left or
right.
5. Seasonality
Predictable changes in your metric driven by factors such as the day of the week or the time of day.
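One common way to inspect a metric for the outliers mentioned above is the interquartile-range rule. The notes do not prescribe a method, so this technique, the `iqr_outliers` helper, and the sample figures are all assumptions made for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the middle 50%."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Made-up daily sales with one unusually large day.
daily_sales = [98, 102, 101, 97, 105, 99, 100, 350, 103, 96]
print(iqr_outliers(daily_sales))
```

A flagged point is a prompt to investigate rather than a value to delete automatically; it may reflect a real event such as a promotion.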
If you take multiple different samples of your control group chances are that at
Business problem:
To save the product line we must demonstrate that we can significantly increase
We believe that emailing our customer base with information about the new toys
• Crawl stage
• Walk stage
• Run stage
Necessary skills:
Budget:
• No formal budget
You make your impact assessment by comparing how things were before you
made a change (pre-period) to how things are after the change (post-period).
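A pre/post comparison can be sketched in a few lines. The sales figures below are made up, and `burn_in` is a hypothetical parameter for skipping the first days after the change so that a short-lived spike does not inflate the estimate.

```python
from statistics import mean


def pre_post_impact(pre_period, post_period, burn_in=0):
    """Difference in average run-rate after vs. before a change.

    burn_in is the number of initial post-change observations to skip.
    """
    stable_post = post_period[burn_in:]
    if not pre_period or not stable_post:
        raise ValueError("need data on both sides of the change")
    return mean(stable_post) - mean(pre_period)


# Made-up daily sales before and after a change.
pre = [100, 104, 98, 102, 96]
post = [150, 140, 112, 110, 108, 114]
print(pre_post_impact(pre, post))             # all post-change days
print(pre_post_impact(pre, post, burn_in=2))  # skip the first two days
```

With these figures the estimate roughly halves once the first two days are skipped, which shows how sensitive a pre/post comparison is to the window chosen.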
Risks:
• All other factors must remain the same – very hard to achieve
Quantifying impact:
Wait until your effect stabilizes (novelty effect needs to wear off)
Measure the difference between the original run-rate and the current run-rate
For example:
Necessary skills:
You have access to resources that can do some fundamental analytics and
Infrastructure:
marketing tactic
understand behaviour
Budget:
impact
Quantifying impact:
• You measure the change in your target metric after you roll out your tactic
• You can make a statement like “the difference in average daily sales
between the test and control groups was 500 before we revamped our
• You employ some crawl stage methods for areas that are yet to have split
testing capabilities
Diff-in-diff (difference-in-differences): measure how the difference between a test
and a control group changes from before to after your tactic, and check whether that
change is statistically significant.
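Difference-in-differences is commonly computed as the change in the test group minus the change in the control group over the same period. The sketch below uses made-up daily sales and a hypothetical `diff_in_diff` helper; it produces the point estimate only and does not test for statistical significance.

```python
from statistics import mean


def diff_in_diff(test_pre, test_post, control_pre, control_post):
    """Change in the test group minus the change in the control group."""
    test_change = mean(test_post) - mean(test_pre)
    control_change = mean(control_post) - mean(control_pre)
    return test_change - control_change


# Made-up average daily sales for each group and period.
estimate = diff_in_diff(
    test_pre=[1000, 1020, 980],
    test_post=[1300, 1280, 1320],
    control_pre=[500, 510, 490],
    control_post=[600, 590, 610],
)
print(estimate)
```

Subtracting the control group's change removes whatever moved both groups at once, such as seasonality, so only the extra movement in the test group is attributed to the tactic.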
Necessary skills:
You have access to resources that can do some fundamental analytics and
advanced modelling
Infrastructure:
marketing tactic
• You have robust in-house tools or strong integration into 3rd party tools to
• Your test velocity is high and you can run multiple tests in parallel due to
Budget:
You have shown strong wins by leveraging data and as such you have solid
Quantifying impact:
• You mainly use methods that confirm that action A leads to action B
• You may still have cases where rigorous measurement is not possible
between an action you take and an outcome that can be observed and measured.
hypothesis.
This metric quantifies your test’s ability to detect an outcome when the outcome
exists.
Typically set at 80%, this means that in 80 out of 100 tests where the effect exists,
In the crawl and walk stages, power is typically looked at as something ‘nice-to-
have’ since what we’re mostly looking for are directional signals about the impact
of our tactics.
All else being equal, the larger the sample size, the more powerful your test
will be.
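The link between sample size and power can be illustrated with a normal approximation for comparing two conversion rates. This is a rough sketch under stated assumptions: equal group sizes, a two-sided test at the 5% significance level, and made-up rates of 5% and 6%. The helper name `power_two_proportions` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_test, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two conversion rates."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_test * (1 - p_test) / n_per_group)
    # Probability of landing beyond the critical value in the direction
    # of the true effect; the tiny opposite-tail term is ignored.
    return normal.cdf(abs(p_test - p_control) / se - z_crit)


# Made-up rates: 5% baseline versus 6% with the treatment.
for n in (1000, 5000, 10000):
    print(n, round(power_two_proportions(0.05, 0.06, n), 2))
```

Under these assumptions, power rises steadily with the number of users per group and only passes the usual 80% target at the largest of the three sample sizes.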
Example: We observe around 38% lift in purchase rate, but it is not statistically
significant at the 95% level. We would want to explore a follow-up test where we
Once you have demonstrated the value and potential impact of testing, you have
accelerating testing.
Armed with your current data and knowledge, find the next biggest initiative to
invest in.
auditing and validating data, capturing events and transactions that may be
missing
Without good data, drawing insights is much harder and the results
may be misleading.
you to run experiments more quickly and efficiently. Work towards holding
Holdouts: take a certain portion of the entire population and make sure they
experience no treatments; then you go ahead with your project and apply
Running experiments in parallel – make sure that the control and test groups are
The framework is the same as the one discussed at Level 1. You want to:
Since you can now quantify the effect on your metrics, you can report
clear increases in those metrics along with statistically significant results.
• data talent
• engineering resources
• data infrastructure
• automation
For example:
You discover that people who spend more than 10 minutes on your website
As a rule of thumb, investing in more robust datasets pays off. You are able to
these attributes. You can even hire people to do machine learning, so you can do
know who is likely to buy what product given what kind of treatment and scenarios
there are. For example, you will know that if you give people $10 discounts you
will increase profits more than if you give them $15 or $8 discounts.
• age
• gender
• location
All of this depends on the legal landscape in your country. Make sure you stay true
Rule of thumb: If you don’t want other people tracking something about you,
chances are your customers would not want you to track them either.
From a growth perspective, at this stage you have likely run out of low risk – high
• Trustworthy data
Ideally, you would have checkpoints in place to make sure that new data satisfies
specific quality standards. You also have a team of engineers and data people
Automation
• Experimental framework:
Parallel experimentation:
• Randomization
• Parallel experimentation
• Population targeting
• Sample sizes
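Randomization and parallel experimentation are often implemented with deterministic hashing, so that a user always lands in the same group and each experiment gets its own split. The sketch below is one possible approach, not the course's method; the `assign_group` helper, the user ids, and the experiment name are hypothetical. Population targeting and sample sizing are not covered.

```python
import hashlib


def assign_group(user_id, experiment, test_share=0.5):
    """Deterministically assign a user to 'test' or 'control'.

    Hashing the experiment name together with the user id gives each
    experiment its own split, so parallel experiments do not reuse the
    same partition of users.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "test" if bucket < test_share else "control"


# Hypothetical user ids and experiment name.
users = [f"user-{i}" for i in range(10_000)]
groups = [assign_group(u, "buy-now-button") for u in users]
print(groups.count("test") / len(groups))
```

Because the assignment depends only on the ids, no lookup table is needed and any system can recompute which group a user belongs to.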
Not all areas of the company are automated at stage 3 data maturity. Using your
data team’s experience and accomplishments, you can help scale your solutions
across other areas of the company. For example, you can extend your automated
discounting to maximize revenue for other products and services, or you can build
template dashboards for people across the company to visualize
At this level, you also likely have the talent and robust data to explore advanced
order relationships. For example, you can use such algorithms for:
• Sales forecasting
• Anomaly detection
• Targeting models
• Recommender models
maturity.
There is not a clearly defined higher data maturity level that is widely adopted in
industry.
3. Identify new stage 0 opportunities in your business and restart the process with
customer is interested in