Int. J. Indian Culture and Business Management, Vol. 21, No. 1, 2020
Marketing channel attribution modelling: Markov
chain analysis
Kunal Mehta
Manager Data Analytics, Marketing,
Publicis Sapient,
Gurugram, India
Email:
[email protected]
Ekta Singhal*
Faculty of Marketing,
Fortune Institute of International Business,
New Delhi, India
Email:
[email protected]
*Corresponding author
Abstract: With the advent of the digital era, the business landscape has evolved
drastically, affecting all marketing and advertising activities.
Advertisers employ multiple channels to reach customers on digital
platforms. The challenge now is to design a methodology that attributes
conversions to these channels in order to measure return on
investment (ROI) and optimise the allocation of the media budget. The problem is
compounded on digital platforms, where people tend to visit multiple times
through multiple channels before each conversion. Conventional models such as first
touch, last touch and linear attribution do not give a statistically complete picture,
yet few resources are available to help practitioners
implement a model such as the Markov attribution model and obtain statistically sound
attribution and analysis of conversions. This paper provides a high-level
overview of the attribution models offered within some of the most
prominent tools, such as Adobe Analytics and Google Analytics. It also
builds the case for a more statistically sound model, ‘Markov
analysis’, and shows how and why it is better than traditional models.
Keywords: marketing strategies; marketing channel; digital platform; channel
attribution model; Markov analysis; web analytics.
Reference to this paper should be made as follows: Mehta, K. and Singhal, E.
(2020) ‘Marketing channel attribution modelling: Markov chain analysis’,
Int. J. Indian Culture and Business Management, Vol. 21, No. 1, pp.63–77.
Biographical notes: Kunal Mehta has worked as a core member of analytics
CoE teams to ideate, innovate, develop and implement integrated analytics
solutions to address business problems ranging from attrition rate, to cross-sell,
to the design and execution of marketing strategies. He has also worked in
reporting to develop, measure, analyse, derive insights from and act on
data-driven information presented in visually rich technology solutions such as
Tableau, QlikView, Power BI etc. Currently he is working with Publicis
Sapient, as one of the data leads for the UK geography, helping clients from
diverse industries such as retail, hospitality, education, automobile, pharma,
BFSI etc. to extract maximum value from data to solve their business problems.
Copyright © 2020 Inderscience Enterprises Ltd.
Ekta Singhal has an excellent academic record with a great aptitude towards
teaching and research. She has obtained her PhD in Marketing Management
from the Department of Commerce and Business Administration (MONIRBA),
University of Allahabad, Master in Business Administration with Marketing
specialisation; and Bachelor in Commerce from Hansraj College, Delhi
University. She has been awarded various academic scholarships. She has
presented her research papers at various conferences held all over India. She is
an avid reader and is keenly aware of new and emerging issues in Marketing
and Branding.
1
Introduction
The rapid expansion of communication technologies has significantly increased the
opportunity for customers to engage with brands whenever and wherever they choose
(Rangaswamy and van Bruggen, 2005). More people now have access to mobile
phones and the internet, leading to widespread digital usage across sectors.
As more people began making their purchase decisions online, marketers
began adopting digital advertising strategies. India’s digital advertisement market is
expected to grow at a compound annual growth rate (CAGR) of 33.5% to cross the
Rs 25,500 crore (US$ 3.8 billion) mark by 2020. This growth has created a gamut of
opportunities for marketers to reach their audience in innovative ways such as search ads,
display ads, and social media.
One consequence of an increasingly diverse array of customer touchpoints is the need to
seamlessly integrate communication strategies across channels (Neslin and
Shankar, 2009). This need has been magnified in recent years by the increased
ability of consumers to select the channels they use (Bell et al., 2014). The growing
number of customer touchpoints and digital channels has seriously complicated the process of
measuring the degree to which each channel contributes to conversions. This raises the
key question of attribution: which ads get credit for a conversion, and how
much credit does each of them get? This is one of the most important questions
facing marketers today.
Multichannel management is the design, deployment, coordination, and evaluation of
the different channels through which firms interact with their customers, with the
aim of increasing customer value (Neslin et al., 2006). It focuses on managing and
enhancing the performance of each channel (Ailawadi and Farris, 2017). To
improve interactions with customers, the key starting point is to understand what those
interactions are and where they take place. Without that understanding it would be
impossible to measure any improvement, or to see whether changes made to those interactions
were having a positive or a detrimental effect. In the customer’s digital journey,
every channel, campaign and experience has a role to play. Just as no single player
decides the fate of a football match, no single channel can be credited with the whole
conversion. This is why multiple attribution models and methodologies exist to divide
the credit for a conversion among the touchpoint channels in one way or another. Some
models are more scientific than others, but there is no universal out-of-the-box model.
This study aims to showcase the relevance of data-driven models, and in particular the
Markov attribution algorithm. Readers should be able to choose the right attribution
approach and learn how to implement a Markov model on data extracted from any web
analytics tool, such as Google Analytics or Adobe Analytics, implemented on their digital
platforms, such as a desktop website, mobile site and/or mobile app.
2
Background of the study
With the advent of the digital era, the business landscape has evolved drastically,
affecting all marketing and advertising activities. Technology has remarkably
changed the marketplace, giving rise to various channels of interaction between
customers and companies. There is an avalanche of information in the form of data
that is easily available for marketers to use (Rana, 2019). Digital media consumption
has increased tremendously through desktop, laptop, mobile and tablet interactions
(Necsulescu, 2015). Advertisers employ multiple channels to reach customers
on digital platforms (Anderl et al., 2014). This has led to a substantial rise in
online advertising as a crucial tool in the promotion mix of many industries
(Raman et al., 2012).
Automated collection of the enormous data sets generated by customers’ use of the internet
creates an opportunity for marketers and analysts to target their marketing campaigns
better and to use the marketing budget optimally (Goldfarb and Tucker, 2011).
Digital advertising media include display, e-mail, search, social media, affiliates, mobile
and app (Shao and Li, 2011). There are also instances when customers visit
the company’s website on their own, commonly called organic search (Li and
Kannan, 2014). Marketers use numerous channels to reach customers along the customer
journey (Anderl et al., 2016). This poses a serious attribution dilemma for marketers
(Anderl et al., 2014; Dalessandro et al., 2012; Jordan et al., 2011d; Kitts et al., 2010; Lee,
2010; Lewis et al., 2011; Wiesel et al., 2011). For most traditional advertising
media there is no direct way to attribute sales to advertising. In digital
advertising, however, both the delivery and the interactions can be tracked (Sinha et al., 2014).
Marketers have been able to identify the inadequacies in current attribution
methodologies (Chandler-Pepelnjak, 2009). As a result, they acknowledge that
measuring the effectiveness of a particular advertising campaign and allocating the
advertising budget optimally was, and still remains, a very complicated task. However,
digital media has made the task easier for marketing practitioners by connecting ad
impressions to user actions and interactions such as a search query, clicking on an ad or
converting (Agichtein et al., 2006).
In digital advertising, the attribution dilemma is the problem of assigning credit to one
or more advertisements for leading to a conversion. Many attribution models have been
developed to address this issue. The simple heuristic (last-touch attribution) model is the
simplest but is full of errors. In this model, credit is assigned to the last ad or channel
the user interacts with before the conversion. Ads that appear earlier in the
customer journey therefore receive no credit, while the ad closest to the conversion
receives all the credit for the desirable action, causing incorrect attribution
and leading to suboptimal advertising budget allocation. De Haan et al. (2016) found that
content-integrated advertising is the most effective form and that last-touch attribution
underestimates content-integrated activities.
Multi-touch attribution models allow more than one ad to receive credit for a conversion
based on its contribution. Kannan (2017) highlighted that attribution models can
provide insights for allocating marketing investments across multiple
channels. Osur et al. (2012) state that the two most important objectives of attribution
models are measuring the value and performance of digital channels and measuring
the impact of one digital channel on the performance of another. Multi-touch
attribution is one of the crucial problems of digital advertising and is gaining importance
both in practice and in academia (Berman, 2018). Previous studies have proposed numerous
analytical attribution frameworks, such as a simple probabilistic model and a bagged logistic
regression model (Shao and Li, 2011), an individual-level probabilistic model (Li and
Kannan, 2014), mutually exciting point process models (Xu et al., 2014), a game
theoretical model (Berman, 2018; Dalessandro et al., 2012), a multivariate time series
model (Kireyev et al., 2016), and the hidden Markov model (HMM). Rossi et al. (2003)
introduced joint stochastic modelling of losses and delays in the HMM description of real
channels, as the hidden states of an HMM can automatically capture the dynamics of
network congestion status. Netzer et al. (2008) developed an HMM to capture the dynamics
of customer relationships, incorporating the effect of multiple customer touchpoints on
subsequent buying behaviour. HMMs have been helpful in studying physicians’
prescriptive behaviour (Montoya et al., 2010) and online viewing behaviour (Schwartz et
al., 2011). Danaher and van Heerde (2018) proposed an attribution formulation which
shows that attribution is proportional to the marginal effectiveness of a medium times its
number of exposures.
Marketers need to develop and apply an attribution model which can help in
ascertaining the effectiveness of individual marketing channels and the interaction of
channels in a multichannel environment across various data sets and situations.
3
Attribution models and methodologies
In an age of proliferating digital media, marketers communicate with customers
through various touchpoints in order to generate leads and conversions. To
understand such complex communications, analytics and advanced tools come into play
(Necsulescu, 2015). To analyse, leverage and respond to multi-channel touchpoints,
attribution models need to interpret the data. An attribution model is
the rule, or set of rules, that determines how credit for sales and conversions is assigned
to touchpoints in conversion paths.
To understand the most common attribution methodologies, consider the customer
journey shown in Figure 1. A user arrives on
your website from a ‘display ad’, then through ‘natural search’, then through ‘paid
search’ and finally through the ‘direct’ channel before purchasing something for $100.
Figure 1
A typical digital journey (see online version for colours)
Some of the most widely used attribution models are:
1
Last touch: all credit for the conversion is attributed to the last channel, in
Figure 1 to ‘direct’. It is by far the most widely used attribution model.
2
First touch: all credit for the conversion is attributed to the first channel, in
Figure 1 to the ‘display ad’. It is widely used by organisations for whom customer
acquisition is paramount. Most start-up organisations rely on it in their early
phases.
3
Linear: in today’s digital world, the average transaction can have more than
25 touchpoints. The linear attribution model distributes the credit for the conversion
equally among all the touchpoints and channels that participated in the visitor
journey; so if there are four touchpoints leading to a
conversion, each touchpoint gets 25% of the total attribution. This is the ‘safe’ option
among attribution models, as it simply gives equal credit to each touchpoint.
4
Last touch non-direct: if the last touch is ‘direct’, the credit for the
conversion is passed on to the channel before ‘direct’. In Figure 1, it is passed
on to ‘paid search’. In most prominent web analytics tools, such as Adobe
and Google, the default attribution is last touch non-direct. Since most
direct traffic comes from visitors who first reached the platform through
other channels, this attribution model makes the most sense to marketing
teams.
5
Time decay: in the time decay attribution model, credit is assigned
according to when the visitor interacted with each channel. The last-touch
channel gets the maximum credit, the second-last channel gets less than
the last, the third-last gets less than the second-last, and so on. The
first touch therefore gets the least credit, and touchpoints closer to conversion are
seen as more valuable. For example, if there are three touchpoints in the order YouTube,
Twitter and Google, then Google, being the last touch, would receive the maximum credit,
say 50% of the attribution, Twitter 30%, and YouTube 20%. This can be a good
model for organisations which depend heavily upon promotional offers to acquire
and convert customers.
6
Position-based: the position-based attribution model combines linear and time-decay
models. In this model 40% of the credit for each conversion is attributed to the
first touchpoint, in Figure 1 the ‘display ad’, 40% to the last touch, ‘direct’ in
our case, and the remaining 20% is distributed equally among the intermediate
touchpoints of the customer journey (‘natural search’ and ‘paid search’ in Figure 1).
This methodology allows
marketers to assign the maximum importance to the first touchpoint that created
brand awareness and the last touchpoint that eventually led to conversion. It is still
widely used by many start-up e-commerce organisations for whom customer
acquisition is as important as conversion.
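The rule-based models listed above can be sketched in a few lines. The following is a minimal illustration on the Figure 1 journey, not the implementation used by any analytics tool; the function name `attribute`, the channel labels, and in particular the time-decay weighting (each step closer to conversion doubles the weight) are assumptions made for the example.

```python
from collections import defaultdict


def attribute(path, value, model):
    """Split `value` across the channels in `path` under a rule-based model."""
    n = len(path)
    if model == "last_touch":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "first_touch":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "linear":
        weights = [1.0 / n] * n
    elif model == "last_non_direct":
        # Credit the last touchpoint that is not 'direct';
        # fall back to the last touchpoint if every visit was direct.
        idx = max((i for i, ch in enumerate(path) if ch != "direct"),
                  default=n - 1)
        weights = [1.0 if i == idx else 0.0 for i in range(n)]
    elif model == "time_decay":
        # Assumed decay scheme: weight doubles with every step towards conversion.
        raw = [2.0 ** i for i in range(n)]
        weights = [w / sum(raw) for w in raw]
    elif model == "position_based":
        if n == 1:
            weights = [1.0]
        elif n == 2:
            weights = [0.5, 0.5]
        else:
            # 40% first, 40% last, remaining 20% shared by the middle touchpoints.
            weights = [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]
    else:
        raise ValueError(f"unknown model: {model}")

    credit = defaultdict(float)
    for channel, weight in zip(path, weights):
        credit[channel] += weight * value
    return dict(credit)


journey = ["display", "natural_search", "paid_search", "direct"]
for model in ("last_touch", "first_touch", "linear",
              "last_non_direct", "time_decay", "position_based"):
    print(model, attribute(journey, 100.0, model))
```

Every model splits the same $100 purchase; only the fixed weighting rule changes, which is why none of them adapts to the data.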
Position-based, time decay and linear models come close to acknowledging each
channel touchpoint involved in the customer journey, but none of these models can be
customised to the data with any statistical confidence, so none can be said to suit
a particular organisation, industry and/or business model best. Given this lack of
customisation, we recommend creating and using a data-driven attribution model, discussed
in subsequent sections, to better understand the channel efficacy and incremental value
of multi-touch campaign programs.
4
Conventional models
Single-touch attribution models such as last touch, first touch and last touch non-direct
are the most popular and widely used by marketing teams across industries. Their biggest
advantage is that they simplify the attribution problem in a way that is not
just easy to understand, but also easy to act upon. If paid search is showing the
maximum conversions and revenue, it is easy to decide to strengthen the channel
that is driving maximum value to your platform, be it a website, mobile site, mobile app,
physical store, or any other platform you might have. The challenge is that
each of these models treats conversion as an isolated event, something that happened
when a visitor came from that particular channel at that particular time. Any
marketer worth their salt knows that a conversion is the result of a multitude of factors
combining across multiple visits through different channels, touchpoints, experiences and
campaigns. Treating it as an isolated event and basing all strategic decisions on
this assumption is a gross mistake, to say the least.
In this regard, multi-touch models such as linear, position-based and time decay are
much better because they take the whole customer journey into account before reporting
the conversion numbers. Anyone looking at these reports can understand
the contribution of each channel and then take decisions to improve the qualitative
journey of the customer to get quantitative gains.
The challenge with these models is that they are all rule-based, so
organisations have no way to customise the model according to their data and customer
preferences. These models also pass the onus of selecting the time frame considered for
allocation to analysts, and it can be adjusted only up to 90 days. In industries such as real
estate, automobile, education and tourism, where the decision-making process can be longer
than 90 days, these models lose significance. Further, the choice of the time
frame for attributing conversions, or lookback window, rests solely with the analyst, and
it is seldom made with any regard to statistical significance or scientific
methodology. More often than not, analysts replicate the window that they find in
AdWords, which is 30 days by default. This leaves the results to vary considerably with
analyst preferences and bias.
5
Employing stochastic models
One thing is certain: in business, especially in B2C business models, there are many
random variables, each with a different degree of significance or
relevance. In such situations stochastic models are used. Stochastic models take random
variables as input to build a mathematical model, an equation, which gives the probability
of something happening even when the input is seemingly random. Two kinds
of attribution algorithms are used in marketing: the marketing mix model
(MMM) and Markov chain analysis.
MMM is one of the oldest statistical techniques used in the field of marketing. It
builds on the 4 Ps of marketing: product, price, promotion, and place. This
model decomposes conversions into two kinds:
1
base conversions, which occur naturally due to factors such as season,
trends, pricing etc.
2
incremental conversions, driven by promotional activities.
Markov chain analysis, on the other hand, relies on the logic that the future state of a
system depends only on its present state and not on the states prior to the current state.
One of the best-known applications of the technique is the gambler’s ruin problem from
casino gambling. MMM has been around for a long time. For this reason it
is deeply rooted in the concepts of marketing and takes in a huge number of variables. Since
it has been used and modified for such a long time, it is more suited to businesses
using traditional marketing strategies, where most of the data comes through
traditional channels such as newspapers, coupon booklets, discount codes, etc. The Markov
attribution model, on the other hand, has evolved to be used with more contemporary
data, such as digital media, social data, AdWords, clickstream, etc.
5.1 Deciding on the model
When the future state of a system/user/visitor depends upon the current state only,
Markov chain analysis is used. However, if it depends not just upon the current state but
also on the previous states, MMM is used. To illustrate, suppose a company’s marketing
team released 10 ‘10% discount’ coupons, 10 ‘25% discount’ coupons and 5 ‘50%
discount’ coupons. If 5 ‘10% discount’ coupons, 2 ‘25% discount’
coupons and 3 ‘50% discount’ coupons have been redeemed so far, what is the probability
that the next coupon redeemed is a ‘50% discount’ coupon, given that a ‘25% discount’
coupon has just been redeemed? Mathematically, the
probability of the next coupon, the future state, depends not just on which
coupon was redeemed just now, the current state, but also on which coupons were redeemed
earlier. In this case MMM would make more sense. Now take
another example: a person landed on a company’s website through paid search, then
through organic search, and has now arrived through the direct channel. What is
the propensity of this user landing on the website through the display channel next? Since
there is no fixed limit on the number of times a visitor can come to the company’s platform
through a specific channel, this is a classic case for Markov chain
analysis.
6
Markov model: Understanding the transition matrix
To understand why the Markov model is chosen over other stochastic
techniques in this paper, we need to understand the Markov property: the next state of
the system depends only upon its present state, not on the preceding states. To
illustrate, suppose a user arrives on a website in the sequence below
before buying something worth $100:
Start > Paid Search > Social > Paid Search > Organic Search > Social > Direct > Purchase
Every visit here leads to another visit, which may come through a different channel. Hence
we can create pairs of consecutive states: (Start, Paid Search); (Paid Search, Social);
(Social, Paid Search); (Paid Search, Organic Search); (Organic Search, Social); (Social,
Direct) and (Direct, End), where End denotes the purchase that closes the journey.
Ordering these pairs by their first channel gives the data set
(Start, Paid Search); (Organic Search, Social); (Paid Search, Social); (Paid Search,
Organic Search); (Social, Paid Search); (Social, Direct) and (Direct, End).
Each channel can thus be followed only by certain channels, which can
be written as Start: [Paid Search]; Organic Search: [Social]; Paid Search: [Social],
[Organic Search]; Social: [Paid Search], [Direct]; Direct: [End].
These data can be used to create a transition diagram with the respective transition
probabilities:
Figure 2
Transitioning from one state to another (see online version for colours)
These transition probabilities represent the same data set that was created above, in a
visual format that makes it easier to see a user’s likely next action given
his/her current state. According to the data set described above, there is only one path
from start, namely paid search, hence its transition probability is 100%. Once a person
has come through paid search, there are two options: coming to the
platform again through social or through organic, each with 50% probability. Once a person
has come from organic, there is only one option based on the sample
journey, namely coming back to the website through the social channel, hence a probability
of 100%. From the social channel there are again two options, coming again through the
direct channel or through paid search, each with 50% probability. Once a person has come
through the direct channel, based on the
sample journey, there is only one option: ending the journey with a conversion.
Because a Markov model can represent these transitions and quantify
their impact on final conversions, this algorithm was
selected to understand the impact of user journeys on our client’s digital platform.
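The transition probabilities described above can be reproduced directly from the sample journey. This is a minimal sketch under the assumption that the journey is held as a simple list of states; the variable names are illustrative, and the final state is labelled Purchase as in the journey listing.

```python
from collections import Counter, defaultdict

# The sample journey from this section, one state per visit.
journey = ["Start", "Paid Search", "Social", "Paid Search",
           "Organic Search", "Social", "Direct", "Purchase"]

# Count each (current state, next state) pair of consecutive visits.
pair_counts = Counter(zip(journey, journey[1:]))

# Total number of transitions leaving each state.
out_totals = Counter()
for (src, _dst), count in pair_counts.items():
    out_totals[src] += count

# Transition probability = pair count / transitions leaving the source state.
transition = defaultdict(dict)
for (src, dst), count in pair_counts.items():
    transition[src][dst] = count / out_totals[src]

for src, targets in transition.items():
    print(src, "->", targets)
```

With a single journey the estimates are necessarily coarse (only 100% and 50% appear); on real data the same counting is applied over all observed paths.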
7
Markov chain analysis: High level steps
The Markov attribution model is available as a package within most statistical analysis
tools. Executing the model is as simple as running one line:
markov_model(data, 'path', 'total_conversions', var_value = 'total_conversion_value').
The challenge is always to reach the stage where the analyst can apply this code to the
prepared data set.
At a very high level, executing Markov chain analysis involves the following steps:
Data extraction and cleaning
↓
Transforming data for creating the input data for Markov algorithm
↓
Executing algorithm
↓
Understanding results
7.1 Data extraction and cleaning
1
Data extraction: Ideally this should be hit-level data. Such data is readily available
from Adobe in the form of data feeds, or from GA 360 through BigQuery. If one
does not have access to GA 360 or BigQuery, one can use the ‘Top Conversion
Paths’ report under Conversions and then Multi-Channel Funnels.
2
Missing value treatment: Data is often not clean and has its
fair share of issues. For some companies, data from certain browsers,
geographies or devices is not available. For example, suppose a company is not
getting data from Chrome version 63 which, as per W3 stats, is still used by
around 2.3% of users across the globe. By not getting data from
this one source, the company can lose visibility of approximately 2% of users, and 2% of
revenue, in the database. This 2% differential can create major discrepancies
in the data and in the downstream analysis.
In such cases the analyst has to choose a missing value treatment:
a
Deleting observations: If the missing data is not relevant to the data
model, or few data points are missing, or deleting the
data points will not introduce any bias or skewness in the data, then these data
points can be deleted.
b
Deleting variables: If a variable has too many missing
observations and is probably not significant to the model, then the
variable can be deleted.
c
Imputation through mean, median or mode: If the observations of the
variable are important to the model and not too many are missing,
we replace the missing values with the mean, median or mode. If the
observations that are present are normally distributed and not
skewed (no very large or very small values), we replace the missing
values with the mean (average) of the observations that are present.
If the data shows some skewness, though not very large, the median is used
to replace the missing values. For categorical variables, the missing values are
replaced by the mode (the most frequent value of the variable).
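The three imputation options can be illustrated as follows. This is a small sketch using Python's standard `statistics` module; the helper name `impute` and the sample values are hypothetical, and missing entries are assumed to be represented by `None`.

```python
import statistics


def impute(values, strategy):
    """Replace None entries with the mean, median or mode of the present values."""
    present = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(present)      # roughly symmetric numeric data
    elif strategy == "median":
        fill = statistics.median(present)    # mildly skewed numeric data
    elif strategy == "mode":
        fill = statistics.mode(present)      # categorical data
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical order values with one missing observation.
order_values = [120.0, 95.0, None, 110.0, 105.0]
print(impute(order_values, "mean"))
print(impute(order_values, "median"))

# Hypothetical categorical variable with one missing observation.
devices = ["mobile", "desktop", None, "mobile"]
print(impute(devices, "mode"))
```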
3
Outlier treatment: A simple way to check whether a variable has outliers is to
calculate the difference between the median and the mean of the observations. If the
mean is far lower or far higher than the median, this suggests that the variable has some
outlier values. At this point a few things need to be ascertained:
a
If this is an error: Wrong data is often reported, for example, revenue for an
order as 999,999 (there are 900,000 six-digit numbers, of which only 9 have
all digits the same, a probability of one in 100,000), or a value such as
1234 recorded as 345q, which looks like an order
number being tracked as revenue by mistake. In such cases one can remove
these values and treat them as missing values, as discussed earlier.
b
Capping at the 1st or 99th percentile: If the outliers are few, replace values
below the 1st percentile with the 1st percentile value, and values
above the 99th percentile with the 99th percentile value.
c
Predict the outliers: This is the most statistically sound technique of outlier
treatment. Rather than being taken as is, the value of the outlier is first
predicted by regression modelling, and this predicted value is then used to
replace the outlier in the data set.
However, analysts should avoid removing outliers from the data altogether.
Figure 3
Identifying outliers (see online version for colours)
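The mean-versus-median check and percentile capping described above can be sketched as below. This is an illustration only: the revenue figures are hypothetical, the helper name `cap_outliers` is an assumption, and percentiles are computed with `statistics.quantiles` using the inclusive method so that the cut points stay within the observed range.

```python
import statistics


def cap_outliers(values, lower_pct=1, upper_pct=99):
    """Clamp values to the [lower_pct, upper_pct] percentile range."""
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical data: 199 ordinary order values plus one suspicious 999,999.
revenue = [100.0 + i for i in range(199)] + [999999.0]

# A large gap between mean and median hints at outliers.
gap = statistics.mean(revenue) - statistics.median(revenue)
print("mean - median:", gap)

capped = cap_outliers(revenue)
print("max before:", max(revenue), "max after:", max(capped))
```

Capping keeps every observation in the data set, in line with the advice not to drop outliers altogether.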
7.2 Transforming data for creating the input data for Markov algorithm
The Markov chain model expects its input data in a pre-defined format: a data set
with four columns:
var_path: the stacked channel/campaign path that the user/visitor has taken across their
visits/sessions so far
For the journey in Figure 1, the path would look like:
Display > Natural > Paid > Direct
var_conv (orders/conversions): the number of conversions that have happened for the
respective path
var_value (revenue/conversion value): the value earned from those conversions, which can
be revenue (or a notional value in the case of goals within GA)
var_null: the number of times the respective path did not result in a conversion
Note that the data is required not at the user level or channel level, but at the path
level. By path we mean every combination of channels that any user has
traversed, whether or not that combination resulted in a conversion. Channels
must be separated by the ‘>’ sign.
Sample data looks like:
Table 1
Creating raw dataset
Path                                          Total_conversions  Total_conversions_value  Total_null
Natural_search>Natural_search                 550                204,679.81               126,837
Paid_search>Paid_search                       342                91,195.84                27,531
Partners                                      340                81,243.96                94,586
Affiliates                                    275                74,890.89                77,157
Natural_search>Natural_search>Natural_search  162                82,903.34                29,703
Paid_search>Referring_domains                 153                39,404.75                9,120
Referring_domains>Referring_domains           126                28,518.18                46,960
Direct>Direct                                 114                43,059.62                38,309
Table 1 is created from the raw data set by building var_path and calculating the
metrics of conversions, conversion value and null conversions, as explained above. This
file must have the path as its unique key, with the number of orders against each
path as var_conv, the revenue as var_value, and the number of times the path did not
result in a conversion as var_null. Once this input data file has been created, we can
easily apply the Markov model available within the package.
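The aggregation from journey-level records to the path-level table can be sketched as follows. The input layout (a list of channels, a conversion flag and a revenue figure per journey), the sample records and the function name `build_path_table` are assumptions for illustration; real hit-level exports would first need to be grouped by visitor and ordered by time.

```python
from collections import defaultdict


def build_path_table(journeys):
    """Aggregate (channels, converted, revenue) records into one row per path."""
    table = defaultdict(lambda: {"total_conversions": 0,
                                 "total_conversion_value": 0.0,
                                 "total_null": 0})
    for channels, converted, revenue in journeys:
        row = table[">".join(channels)]   # channels joined by the '>' separator
        if converted:
            row["total_conversions"] += 1
            row["total_conversion_value"] += revenue
        else:
            row["total_null"] += 1        # the path ended without a conversion
    return dict(table)


# Hypothetical journeys, one record per visitor journey.
journeys = [
    (["Natural_search", "Natural_search"], True, 372.0),
    (["Natural_search", "Natural_search"], False, 0.0),
    (["Paid_search", "Referring_domains"], True, 258.0),
    (["Partners"], False, 0.0),
    (["Natural_search", "Natural_search"], True, 128.0),
]

for path, row in build_path_table(journeys).items():
    print(path, row)
```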
7.3 Executing the model
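To execute the Markov analysis in R, we used the ‘markovchain’ library. Note that
the ‘markov_model’ function called below is exported by the ‘ChannelAttribution’
package, which must also be installed and loaded.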
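library(ChannelAttribution)  # provides markov_model()
# R is case-sensitive: function, object and column names must match exactly
attr_markov <- markov_model(final_data, 'path', 'total_conversions',
                            var_value = 'total_conversion_value')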
The function ‘markov_model’ executes the model and returns the desired results.
The data set created in the earlier steps has the required columns, so the model
should run seamlessly.
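To make the calculation less of a black box, the following is a rough sketch of a first-order Markov attribution based on the commonly used "removal effect" idea: a channel's credit is proportional to the drop in overall conversion probability when that channel is removed from the chain. This is an illustrative assumption, not the package's actual implementation; the state labels, function names and the two-row input are hypothetical, and the sketch assumes every path has at least one observation and that at least one conversion exists.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(paths):
    """paths: list of (path_string, conversions, nulls) -> {state: {next: prob}}."""
    counts = defaultdict(lambda: defaultdict(float))
    for path, conv, null in paths:
        states = [START] + path.split(">")
        for a, b in zip(states, states[1:]):
            counts[a][b] += conv + null
        counts[states[-1]][CONV] += conv   # path ended in a conversion
        counts[states[-1]][NULL] += null   # path ended without one
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}


def conversion_prob(probs, removed=None):
    """Probability of reaching CONV from START; a removed channel never converts."""
    p = {state: 0.0 for state in probs}
    p[CONV], p[NULL] = 1.0, 0.0
    for _ in range(10_000):                # fixed-point iteration
        delta = 0.0
        for state, nxt in probs.items():
            if state == removed:
                continue
            new = sum(w * p[t] for t, w in nxt.items() if t != removed)
            delta = max(delta, abs(new - p[state]))
            p[state] = new
        if delta < 1e-12:
            break
    return p[START]


def markov_attribution(paths):
    """Share total conversions among channels in proportion to removal effects."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = [s for s in probs if s != START]
    removal = {ch: 1.0 - conversion_prob(probs, removed=ch) / base
               for ch in channels}
    total_removal = sum(removal.values())
    total_conv = sum(conv for _, conv, _ in paths)
    return {ch: total_conv * r / total_removal for ch, r in removal.items()}


# Hypothetical path-level input: (path, conversions, non-conversions).
sample = [("A>B", 1, 0), ("A", 0, 1)]
print(markov_attribution(sample))
```

In this toy input, removing either channel blocks the only converting path, so both receive equal credit. On real data the credits differ because channels sit on different shares of the converting paths.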
7.4 Understanding results
Markov analysis provides two kinds of outputs:
1
Transition matrix: it provides mathematical probability of the next step, when the
current step is given.
Example output:
Figure 4
Transition matrix (see online version for colours)
This way we can determine the best possible channel paths to
plan for our customers and how we should lead them to our website to maximise the
probability of conversion. Reading the transition matrix shown above, across
all the bookings on our website, the probability of conversion is
30.62% if a user comes from the affiliate channel, but this user’s most probable next step
is to come from ‘natural search’. It also shows that the probability of
conversion is highest when a user comes to our website from an internal platform
such as an internal blog, support site, or campaign microsite. At the same time, the
probability of conversion is drastically lower when a user comes from a paid campaign on
social media channels.
2 Attribution of conversions: in addition to the transition matrix, it also provides the
statistically calculated attribution of conversions to channels.
Table 2 shows the contribution of each channel in terms of its share of orders and of
revenue earned from the users who came through it, along with the average order
value. It clearly shows that high-value buyers are coming through apps, which
motivates us to engage more with current and prospective customers through mobile
apps and to run campaigns for app downloads.
At the same time, social media is under-performing on all major metrics:
conversions, revenue and average order value. In this way the value of each channel
and the incremental gains from it can be understood, so that campaign programmes
can be further optimised and marketing costs reduced.
Table 2 Attributing conversions and revenue
                           Custom
Channel name               % Conversions   % Rev     Rev/Order ($)
Affiliates                 1.81%           1.88%     482.21
Direct                     14.74%          15.89%    499.98
Email                      0.24%           0.22%     428.52
Display                    0.20%           0.20%     455.32
App                        0.08%           0.12%     616.87
Internal                   8.03%           7.01%     404.51
Natural_Search             37.12%          42.26%    528.22
Paid_Search                18.85%          18.18%    447.35
Partners                   3.88%           3.61%     433.03
Referring_Domains          34.84%          30.49%    406.00
Social_Networks_organic    0.17%           0.12%     310.37
Social_Networks_paid       0.04%           0.02%     355.88
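A common way for Markov attribution to turn a transition matrix into channel shares of the kind reported in Table 2 is the removal effect: the proportion of conversions lost when a channel is taken out of the chain. The sketch below assumes that approach and uses a made-up matrix with two hypothetical channels, A and B. It is not the computation behind Table 2, whose underlying matrix is not reproduced here.

```r
# Toy first-order transition matrix (made-up probabilities); rows sum to 1.
states <- c("(start)", "A", "B", "(conversion)", "(null)")
P <- matrix(0, 5, 5, dimnames = list(states, states))
P["(start)", c("A", "B")]                <- c(0.6, 0.4)
P["A", c("B", "(conversion)", "(null)")] <- c(0.5, 0.2, 0.3)
P["B", c("(conversion)", "(null)")]      <- c(0.5, 0.5)
P["(conversion)", "(conversion)"]        <- 1
P["(null)", "(null)"]                    <- 1

# Probability of eventually converting when starting from "(start)",
# obtained by solving the absorbing-chain equations (I - Q) x = r.
conversion_prob <- function(P) {
  transient <- setdiff(rownames(P), c("(conversion)", "(null)"))
  Q <- P[transient, transient, drop = FALSE]
  r <- P[transient, "(conversion)"]
  x <- solve(diag(length(transient)) - Q, r)
  unname(x[match("(start)", transient)])
}

# Removal effect: share of conversions lost if every visit to the
# channel is instead treated as a lost journey.
removal_effect <- function(P, channel) {
  P_removed <- P
  P_removed[, "(null)"] <- P_removed[, "(null)"] + P_removed[, channel]
  P_removed[, channel]  <- 0
  P_removed[channel, ]  <- 0
  P_removed[channel, "(null)"] <- 1
  1 - conversion_prob(P_removed) / conversion_prob(P)
}

channels <- c("A", "B")
effects  <- sapply(channels, function(ch) removal_effect(P, ch))
shares   <- effects / sum(effects)   # normalised attribution shares
print(round(shares, 4))
```

With these toy numbers the overall conversion probability is 0.47, and the normalised shares come out at roughly 43.5% for A and 56.5% for B. Multiplying such shares by total conversions or total revenue would give attributed orders and revenue per channel.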
8 Conclusions
This study proposes an attribution framework based on Markov chain analysis to address
the problem of attribution. Higher-order models allow new data-driven insights into the
interplay of channels in a multi-channel environment. This paper aligns with previous
studies (Berman, 2018; Li and Kannan, 2014) in indicating that last-click attribution
measures are appropriate neither for determining the factual contribution of individual
channels nor for determining the interplay of multiple channels, and it highlights the
relevance of the Markov attribution model, along with its method of implementation. For
online advertising, having the right attribution model is crucial as it drives performance
metrics and advertising insights. The study illustrates the importance of attribution so that
marketers can accurately determine the credit due to each channel for the overall
conversions. It has important managerial implications for allocating marketing budgets
across various marketing channels.
9 Implications
One of the most critical questions that marketers face after creating the attribution model
is whether to reallocate the marketing budget across marketing channels according to the
newly derived insights. The answer is always a resounding yes. Marketers need to ensure
that they are able to monitor the ROI (return on investment) of their changed marketing
strategies and that sufficient data is collected from the new strategies before taking any
meaningful decision.
Attribution analysis is probably one of the most challenging aspects of digital analytics.
It can help marketers find the true ROI of each channel, the efficacy of campaigns, and
the channels and campaigns that are better at acquisition and conversion. With this they
can create a media sequence plan and a campaign calendar, identify the high-performing
and under-valued channels, incentivise media teams to improve performance at a holistic
level, and ascertain the value of their strategic campaigns through more tactical data.
Although many attribution models are available as out-of-the-box solutions within
analytics tools, and different industries have their own ways of measuring success, any
data measurement methodology without statistical vetting is as good as having no
measurement strategy at all.
References
Agichtein, E., Brill, E., Dumais, S. and Ragno, R. (2006) ‘Learning user interaction models for
predicting web search result preferences’, in Proceedings of the 29th Annual International
ACM SIGIR Conference on Research and Development in Information Retrieval, August,
pp.3–10.
Ailawadi, K.L. and Farris, P.W. (2017) ‘Managing multi-and omni-channel distribution: metrics
and research directions’, Journal of Retailing, Vol. 93, No. 1, pp.120–135.
Anderl, E., Becker, I., von Wangenheim, F. and Schumann, J.H. (2016) ‘Mapping the customer
journey: Lessons learned from graph-based online attribution modeling’, International Journal
of Research in Marketing, Vol. 33, No. 3, pp.457–474.
Bell, D.R., Gallino, S. and Moreno, A. (2014) ‘How to win in an omnichannel world’, MIT Sloan
Management Review, Vol. 56, No. 1, p.45.
Berman, R. (2018) ‘Beyond the last touch: Attribution in online advertising’, Marketing Science,
Vol. 37, No. 5, pp.771–792.
Chandler-Pepelnjak, J. (2009) Measuring ROI Beyond the Last Ad: Winners and Losers in the
Purchase Funnel are Different When Viewed Through a New Lens, Microsoft Advertising
Institute.
Dalessandro, B., Perlich, C., Stitelman, O. and Provost, F. (2012) ‘Causally motivated attribution
for online advertising’, in Proceedings of the Sixth International Workshop on Data Mining
for Online Advertising and Internet Economy, ACM, p.7.
Danaher, P.J. and van Heerde, H.J. (2018) ‘Delusion in attribution: caveats in using attribution for
multimedia budget allocation’, Journal of Marketing Research, Vol. 55, No. 5, pp.667–685.
De Haan, E., Wiesel, T. and Pauwels, K. (2016) ‘The effectiveness of different forms of online
advertising for purchase conversion in a multiple-channel attribution framework’,
International Journal of Research in Marketing, Vol. 33, No. 3, pp.491–507.
Goldfarb, A. and Tucker, C. (2011) ‘Online display advertising: Targeting and obtrusiveness’,
Marketing Science, Vol. 30, No. 3, pp.389–404.
Jordan, P., Mahdian, M., Vassilvitskii, S. and Vee, E. (2011) ‘The multiple attribution problem in
pay-per-conversion advertising’, in International Symposium on Algorithmic Game Theory,
Springer, Berlin, Heidelberg, pp.31–43.
Kannan, P.K. (2017) ‘Digital marketing: a framework, review and research agenda’, International
Journal of Research in Marketing, Vol. 34, No. 1, pp.22–45.
Kireyev, P., Pauwels, K. and Gupta, S. (2016) ‘Do display ads influence search? Attribution and
dynamics in online advertising’, International Journal of Research in Marketing, Vol. 33,
No. 3, pp.475–490.
Kitts, B., Wei, L., Au, D., Powter, A. and Burdick, B. (2010) ‘Attribution of conversion events to
multi-channel media’, in 2010 IEEE International Conference on Data Mining, pp.881–886.
Lee, G. (2010) ‘Death of ‘last click wins’: media attribution and the expanding use of media data’,
Journal of Direct, Data and Digital Marketing Practice, Vol. 12, No. 1, pp.16–26.
Lewis, R.A., Rao, J.M. and Reiley, D.H. (2011) ‘Here, there, and everywhere: correlated online
behaviors can lead to overestimates of the effects of advertising’, in Proceedings of the 20th
International Conference on World Wide Web, March, pp.157–166.
Li, H. and Kannan, P.K. (2014) ‘Attributing conversions in a multichannel online marketing
environment: An empirical model and a field experiment’, Journal of Marketing Research,
Vol. 51, No. 1, pp.40–56.
Montoya, R., Netzer, O. and Jedidi, K. (2010) ‘Dynamic allocation of pharmaceutical detailing and
sampling for long-term profitability’, Marketing Science, Vol. 29, No. 5, pp.909–924.
Necsulescu, N. (2015) ‘Multi-channel attribution and its role in marketing investment’, Journal of
Digital and Social Media Marketing, Vol. 3, No. 2, pp.125–134.
Neslin, S.A. and Shankar, V. (2009) ‘Key issues in multichannel customer management: current
knowledge and future directions’, Journal of Interactive Marketing, Vol. 23, No. 1, pp.70–81.
Neslin, S.A., Grewal, D., Leghorn, R., Shankar, V., Teerling, M.L., Thomas, J.S. and Verhoef, P.C.
(2006) ‘Challenges and opportunities in multichannel customer management’, Journal of
Service Research, Vol. 9, No. 2, pp.95–112.
Netzer, O., Lattin, J.M. and Srinivasan, V. (2008) ‘A hidden Markov model of customer
relationship dynamics’, Marketing Science, Vol. 27, No. 2, pp.185–204.
Osur, A., Riley, E., Moffett, T., Glass, S. and Komar, E. (2012) The Forrester Wave Interactive
Attribution Vendors Q2 2012, Forrester White Paper.
Raman, K., Mantrala, M.K., Sridhar, S. and Tang, Y.E. (2012) ‘Optimal resource allocation with
time-varying marketing effectiveness, margins and costs’, Journal of Interactive Marketing,
Vol. 26, No. 1, pp.43–52.
Rana, S. (2019) ‘Moving in the realm of big data: using analytics in management research and
practices’, FIIB Business Review, Vol. 8, No. 1, pp.7–8.
Rangaswamy, A. and van Bruggen, G.H. (2005) ‘Opportunities and challenges in multichannel
marketing: An introduction to the special issue’, Journal of Interactive Marketing, Vol. 19,
No. 2, pp.5–11.
Rossi, P.S., Romano, G., Palmieri, F. and Iannello, G. (2003) ‘A hidden Markov model for internet
channels’, in Proceedings of the 3rd IEEE International Symposium on Signal Processing and
Information Technology (IEEE Cat. No. 03EX795), IEEE, pp.50–53.
Schwartz, E.M., Bradlow, E., Fader, P. and Zhang, Y. (2011) ‘Children of the HMM’: Modeling
Longitudinal Customer Behavior at Hulu.com, WCAI Working Paper Series.
Shao, X. and Li, L. (2011) ‘Data-driven multi-touch attribution models’, in Proceedings of the 17th
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM,
pp.258–264.
Sinha, R., Saini, S. and Anadhavelu, N. (2014) ‘Estimating the incremental effects of interactions
for marketing attribution’, in 2014 International Conference on Behavioral, Economic, and
Socio-Cultural Computing, pp.1–6.
Wiesel, T., Pauwels, K. and Arts, J. (2011) ‘Practice prize paper – marketing’s profit impact:
Quantifying online and off-line funnel progression’, Marketing Science, Vol. 30, No. 4,
pp.604–611.
Xu, L., Duan, J.A. and Whinston, A. (2014) ‘Path to purchase: A mutually exciting point
process model for online advertising and conversion’, Management Science, Vol. 60, No. 6,
pp.1392–1412.