
Marketing channel attribution modelling: Markov chain analysis

2020, International Journal of Indian Culture and Business Management

With the advent of the digital era, the business landscape has evolved drastically, affecting all marketing and advertising activities. Advertisers employ multiple channels to reach customers on digital platforms. The challenge is to design a methodology that attributes conversions to these multiple channels in order to measure ROI (return on investment) and optimise the allocation of the media budget. The problem is compounded on digital platforms, where people tend to visit multiple times through multiple channels before each conversion. Conventional models of first touch, last touch and linear attribution do not give a statistically complete picture, but at the same time there are not enough resources available to help implement a model like the Markov attribution model to obtain statistically sound attribution and analysis of conversions. This paper aims to provide a high-level overview of the different attribution models provided within some of the most prominent tools, like Adobe Analytics and Google Analytics. At the same time, the paper builds the case for a more statistically sound model like 'Markov analysis' to showcase how and why it is better than traditional models.

Int. J. Indian Culture and Business Management, Vol. 21, No. 1, 2020
Kunal Mehta, Manager Data Analytics, Marketing, Publicis Sapient, Gurugram, India. Email: [email protected]
Ekta Singhal*, Faculty of Marketing, Fortune Institute of International Business, New Delhi, India. Email: [email protected]
*Corresponding author
Keywords: marketing strategies; marketing channel; digital platform; channel attribution model; Markov analysis; web analytics.
Reference to this paper should be made as follows: Mehta, K. and Singhal, E. (2020) ‘Marketing channel attribution modelling: Markov chain analysis’, Int. J. Indian Culture and Business Management, Vol. 21, No. 1, pp.63–77.
Biographical notes: Kunal Mehta has worked as a core member of analytics CoE teams to ideate, innovate, develop and implement integrated analytics solutions that address business problems ranging from attrition rate to cross-sell to the design and execution of marketing strategies. He has also worked in reporting, where he developed, measured and analysed data-driven information, derived insights from it and acted on it, using visually rich technology solutions like Tableau, Qlikview and Power BI. Currently he is working with Publicis Sapient as one of the data leads for the UK geography, helping clients from diverse industries like retail, hospitality, education, automobile, pharma and BFSI to extract maximum value from data to solve their business problems.
Ekta Singhal has an excellent academic record with a great aptitude for teaching and research. She obtained her PhD in Marketing Management from the Department of Commerce and Business Administration (MONIRBA), University of Allahabad, a Master in Business Administration with Marketing specialisation, and a Bachelor in Commerce from Hansraj College, Delhi University. She has been awarded various academic scholarships. She has presented her research papers at various conferences held all over India. She is an avid reader and is keenly aware of new and emerging issues in marketing and branding.
Copyright © 2020 Inderscience Enterprises Ltd.

1 Introduction
The rapid expansion of communication technologies has significantly increased the opportunity for customers to engage with brands whenever and wherever they choose (Rangaswamy and van Bruggen, 2005). More people now have access to mobile phones and the internet, leading to widespread use of digital media across various sectors. As more people started making their purchase decisions online, marketers started adopting digital advertising strategies.
India’s digital advertisement market is expected to grow at a compound annual growth rate (CAGR) of 33.5% to cross the Rs 25,500 crore (US$ 3.8 billion) mark by 2020. This growth has created a gamut of opportunities for marketers to reach their audience in innovative ways like search ads, display ads and social media. One consequence of the increasingly diverse array of customer touchpoints is the need to seamlessly integrate communication strategies across various channels (Neslin and Shankar, 2009). This need has been magnified in recent years because of the increased ability of consumers to select the channels they use (Bell et al., 2014). The growing number of customer touchpoints and digital channels has seriously complicated the process of measuring the degree to which each channel contributes to conversions. This raises the key question of attribution: which particular ad gets credit for a conversion, and how much credit does each of these ads get? This is one of the most important questions facing marketers today.
Multichannel is considered as the design, deployment, coordination and evaluation of the different channels through which firms interact with their customers, with the aim of increasing customer value (Neslin et al., 2006). It focuses on handling and enhancing the performance of each channel (Ailawadi and Farris, 2017). In order to improve interactions with customers, the key starting point is to understand what those interactions are and where they take place. Without that understanding it would be impossible to measure any improvement or to see whether changes made to those interactions were having a detrimental or a positive effect. In the customer’s digital journey, every channel, campaign and experience has a role to play. Just as no single player decides the fate of a football match, no single channel can be credited with the whole conversion.
This is why we have multiple attribution models and methodologies that help us divide the credit for a conversion among the touchpoint channels in one way or another. Some models are more scientific than others, but there is no universal out-of-the-box model. The study aims to showcase the relevance of data-driven models, and in particular the Markov attribution algorithm. Readers should be able to choose the right attribution approach and learn how to implement a Markov model on the data extracted from any web analytics tool, like Google Analytics or Adobe Analytics, implemented on their digital platform, such as a desktop website, mobile site and/or mobile app.

2 Background of the study
With the advent of the digital era, the business landscape has evolved drastically, affecting all marketing and advertising activities. Technology has remarkably changed the marketplace, leading to the birth of various channels of interaction between customers and companies. There is an avalanche of information in the form of data that is easily available for marketers to use (Rana, 2019). Digital media consumption has increased tremendously through desktop, laptop, mobile and tablet interactions (Necsulescu, 2015). Advertisers employ multiple channels to reach customers on digital platforms (Anderl et al., 2014). This has led to a substantial rise in online advertising as a crucial tool in the promotion mix of many industries (Raman et al., 2012). Automated collection of the enormous data sets generated by customers’ use of the internet creates an opportunity for marketers and analysts to target their marketing campaigns better and to make optimal use of the marketing budget (Goldfarb and Tucker, 2011). Digital advertising media include display, e-mail, search, social media, affiliates, mobile and app (Shao and Li, 2011).
Also, there are instances when customers visit the company’s website on their own, commonly called organic search (Li and Kannan, 2014). Marketers use numerous channels to reach customers in their customer journey (Anderl et al., 2016). This poses a serious attribution dilemma for marketers (Anderl et al., 2014; Dalessandro et al., 2012; Jordan et al., 2011d; Kitts et al., 2010; Lee, 2010; Lewis et al., 2011; Wiesel et al., 2011). For most traditional advertising media there is no direct way to attribute sales to advertising. However, in the case of digital advertising both the delivery and the interactions can be tracked (Sinha et al., 2014). Marketers have been able to identify the inadequacies in the current attribution methodologies (Chandler-Pepelnjak, 2009). As a result, they acknowledge that measuring the effectiveness of a particular advertising campaign and allocating the advertising budget optimally was, and still remains, a very complicated task. However, digital media have made the task easier for marketing practitioners by connecting ad impressions to user actions and interactions such as a search query, a click on an ad or a conversion (Agichtein et al., 2006).
In digital advertising, the dilemma of attribution is the problem of assigning credit to one or more advertisements for leading to a conversion. Many attribution models have been developed to address this issue. The simple heuristic (last-touch attribution) model is the simplest but is full of errors. In this model, credit is assigned to the last ad/channel that the user interacts with before the conversion. Ads that appear much earlier in the customer journey therefore receive less credit, and ads that occur closer to the conversion receive most of the credit for the desired action, causing incorrect attribution and leading to suboptimal advertising budget allocation. De Haan et al.
(2016) found that content-integrated advertising is the most effective form and that last-touch attribution underestimates content-integrated activities. The multi-touch attribution model allows more than one ad to get credit for a conversion based on its contribution. Kannan (2017) highlighted that attribution models can help provide insights for allocating marketing investments across multiple channels. Osur et al. (2012) state that the two most important objectives of attribution models are measuring the value and performance of digital channels and measuring the impact of one digital channel on the performance of another. Multi-touch attribution is one of the crucial problems of digital advertising and is gaining importance both in practice and in academia (Berman, 2018).
Previous studies have proposed numerous analytical attribution frameworks, like the simple probabilistic model and bagged logistic regression model (Shao and Li, 2011), an individual-level probabilistic model (Li and Kannan, 2014), mutually exciting point process models (Xu et al., 2014), game-theoretical models (Berman, 2018; Dalessandro et al., 2012), a multivariate time series model (Kireyev et al., 2016), and the hidden Markov model (HMM). Rossi et al. (2003) introduced joint stochastic modelling of losses and delays in the HMM description of real channels, as an HMM is able to automatically capture, through its hidden states, the dynamics of the network congestion status. Netzer et al. (2008) developed an HMM to capture the dynamics of customer relationships, incorporating the effect of multiple customer touchpoints on subsequent buying behaviour. HMMs have been helpful in studying physicians’ prescriptive behaviour (Montoya et al., 2010) and online viewing behaviour (Schwartz et al., 2011). Danaher and van Heerde (2018) proposed an attribution formulation which shows that attribution is proportional to the marginal effectiveness of a medium times its number of exposures.
Marketers need to develop and apply an attribution model that can help ascertain the effectiveness of individual marketing channels and the interaction of channels in a multichannel environment across various data sets and situations.

3 Attribution models and methodologies
In the age of proliferating digital media, marketers communicate with customers through various touchpoints in order to generate leads and conversions. To understand such complex communications, analytics and advanced tools come into play (Necsulescu, 2015). To analyse, leverage and respond to multi-channel touchpoints, attribution models need to make the necessary interpretation of data. An attribution model is the rule, or set of rules, that determines how credit for sales and conversions is assigned to touchpoints in conversion paths. To understand the most common attribution methodologies and processes, let us consider the journey of a customer as highlighted in Figure 1. A user comes to your website from a ‘display ad’, then through ‘natural search’, then through ‘paid search’ and finally through the ‘direct’ channel before purchasing something for $100.
Figure 1 A typical digital journey (see online version for colours)
Some of the most widely used attribution models are:
1 Last touch: all credit for the conversion is attributed to the last channel, in Figure 1 to ‘direct’. It is by far the most widely used attribution model.
2 First touch: all credit for the conversion is attributed to the first channel, in Figure 1 to the ‘display ad’. It is widely used by organisations for whom customer acquisition is paramount. Most start-up organisations rely on this in their early phases.
3 Linear: in today’s digital world, the average transaction can have more than 25 touchpoints.
The linear attribution model distributes credit for the conversion in equal proportion among all the channels that participated in the visitor journey; so if there are four touchpoints leading to a conversion, each touchpoint gets 25% of the total attribution. This is the ‘safe’ option among attribution models, as it simply gives equal credit to each of the touchpoints.
4 Last touch non-direct: if the last touch is ‘direct’, the credit for the conversion is passed on to the channel before ‘direct’. In Figure 1, it will be passed on to ‘paid search’ instead. In most of the prominent web analytics tools, like Adobe and Google, the default attribution is last touch non-direct. Since most direct traffic is the result of users’ earlier interaction with our platform after arriving through various other channels, this attribution model makes the most sense to marketing teams.
5 Time decay: in the time decay attribution model, credit is measured according to when the visitor interacted with each touchpoint. The last touch channel gets the maximum credit, the second-last channel gets less than the last, the third-last gets less than the second-last, and so on. The first touch therefore gets the least credit, and the touchpoints closer to conversion are seen as more valuable. For example, if there are three touchpoints in the order YouTube, Twitter and Google, then Google, being the last touch, would receive the maximum credit, i.e., 50% of the attribution, Twitter 30% and YouTube 20%. This can be a good model for organisations that depend heavily on promotional offers to acquire and convert customers.
6 Position-based: the position-based attribution model combines linear and time-decay models.
In this model 40% of the credit for each conversion is attributed to the first touchpoint, in Figure 1 the ‘display ad’, 40% to the last touchpoint, ‘direct’ in our case, and the remaining 20% is distributed equally among the touchpoints in between. This methodology allows marketers to assign the maximum importance to the first touchpoint, which created brand awareness, and the last touchpoint, which eventually led to conversion. It is still widely used by many start-up e-commerce organisations for whom customer acquisition is as important as conversion.
Position-based, time decay and linear models come close to acknowledging every channel touchpoint involved in the customer journey, but none of these models can be customised according to the data, with any statistical confidence, to best suit a given organisation, industry and/or business model. Because of this lack of customisation, we recommend creating and using a data-driven attribution model, discussed in subsequent sections, to better understand the channel efficacy and incremental value of multi-touch campaign programs.

4 Conventional models
The single-touch attribution models, like last touch, first touch and last touch non-direct, are the most popular and are widely used by marketing teams across different industries. The biggest advantage of these models is that they simplify the attribution problem in a way that is not just easy to understand but also easy to act upon. If your paid search is showing the maximum conversions and revenue, it is easy to decide to strengthen the channel that is driving maximum value to your platform, be it a website, mobile site, mobile app, physical store or any other platform you might have. The challenge is that each of these models treats conversion as an isolated event, something that happened when a visitor came from that particular channel at that particular time.
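As a rough illustration of the rule-based models listed in Section 3, the following Python sketch splits the value of one conversion across the touchpoints of the Figure 1 journey. It is not taken from the paper or from any analytics tool: the function names, the channel labels and the geometric `decay` factor used for time decay are assumptions made for the example (the paper gives only a 50/30/20 illustration, not a formula), and last touch non-direct is read here as "the last touchpoint that is not direct".

```python
from collections import defaultdict


def rule_weights(path, model, decay=0.5):
    """Return one credit weight per touchpoint; the weights sum to 1."""
    n = len(path)
    if n == 0:
        return []
    if model == "last_touch":
        return [0.0] * (n - 1) + [1.0]
    if model == "first_touch":
        return [1.0] + [0.0] * (n - 1)
    if model == "linear":
        return [1.0 / n] * n
    if model == "last_touch_non_direct":
        # Credit the last touchpoint that is not 'direct'; if every
        # touchpoint is 'direct', fall back to plain last touch.
        non_direct = [i for i, ch in enumerate(path) if ch != "direct"]
        winner = non_direct[-1] if non_direct else n - 1
        return [1.0 if i == winner else 0.0 for i in range(n)]
    if model == "time_decay":
        # Assumed scheme: every step further back from the conversion
        # multiplies the weight by `decay`, then weights are normalised.
        raw = [decay ** (n - 1 - i) for i in range(n)]
        return [w / sum(raw) for w in raw]
    if model == "position_based":
        if n <= 2:
            return [1.0 / n] * n
        # 40% first, 40% last, remaining 20% shared by the middle touchpoints.
        return [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]
    raise ValueError(f"unknown model: {model}")


def attribute(path, value, model, decay=0.5):
    """Split `value` across the channels of `path` under the given model."""
    credit = defaultdict(float)
    for channel, weight in zip(path, rule_weights(path, model, decay)):
        credit[channel] += weight * value
    return dict(credit)


# The journey of Figure 1, ending in a $100 purchase.
journey = ["display", "natural_search", "paid_search", "direct"]
for model in ("last_touch", "first_touch", "linear",
              "last_touch_non_direct", "time_decay", "position_based"):
    print(model, attribute(journey, 100.0, model))
```

Every rule above is fixed in advance, and none of the weights is estimated from data, which is the limitation the following paragraphs discuss.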
Any marketer worth their salt would know that conversion is the result of a multitude of factors combining across multiple visits through different channels, touchpoints, experiences and campaigns. So treating it as an isolated event and basing all your strategic decisions on this assumption is a gross mistake, to say the least. In this regard the multi-touch models, like linear, position-based and time decay, are much better because they take the whole customer journey into account before showing the conversion numbers. Anyone looking at these reports would be able to understand the contribution of each channel and then take decisions to improve the qualitative journey of the customer to get quantitative gains. The challenge with these models is that they are all rule based, so no organisation has a way to customise the model according to its data and customer preferences. These models also pass the onus of selecting the time frame to consider for allocation to analysts, and it can be adjusted only up to 90 days. In industries like real estate, automobile, education and tourism, where the decision-making process can be longer than 90 days, these models lose significance. Further, the decision on the time frame for attributing conversion, or lookback window, rests solely with the analyst, and it is seldom made with any consideration of statistical significance or scientific methodology. More often than not, analysts tend to replicate the window that they find in AdWords, which is 30 days by default. This makes the results vary a lot with analyst preferences and bias.

5 Employing stochastic models
One thing is certain: in business, especially in B2C business models, we have a lot of random variables, each with a different degree of significance or relevance. In such situations stochastic models are used.
Stochastic models take random variables as input to build a mathematical model, an equation, which gives the probability of something happening even when the input is seemingly random. Two kinds of attribution algorithms are used in marketing: first, the marketing mix model (MMM), and second, Markov chain analysis.
MMM is one of the oldest statistical techniques used in the field of marketing. It depends upon the 4 Ps of marketing, which are product, price, promotion and place. This model decomposes conversions into two kinds:
1 base conversions, which occur naturally due to factors like season, trends, pricing, etc.
2 incremental conversions, driven by promotional activities.
Markov chain analysis, on the other hand, relies on the logic that the future state of a system depends only on its present state and not on the states prior to it. One of the most famous use cases of the technique is gambler’s ruin, used widely by casinos across the globe.
The MMM has been around for a long time. For this reason it is deeply rooted in the concepts of marketing and takes in a huge number of variables. Since it has been used and modified for such a long time, it is more suited to businesses using traditional marketing strategies, where most of the data comes through traditional channels like newspapers, coupon booklets, discount codes, etc. The Markov attribution model, on the other hand, has evolved to be used with more contemporary data, like digital media, social data, AdWords, clickstream, etc.

5.1 Deciding on the model
When the future state of a system/user/visitor depends upon the current state only, Markov chain analysis is used. However, if it depends not just on the current state but also on the previous states, MMM is used.
Let us exemplify this. A company’s marketing team released 10 ‘10% discount’ coupons, 10 ‘25% discount’ coupons and 5 ‘50% discount’ coupons. If it has so far received 5 ‘10% discount’ coupons, 2 ‘25% discount’ coupons and 3 ‘50% discount’ coupons, what is the probability of getting a ‘50% discount’ coupon next, if it has just received a ‘25% discount’ coupon? Mathematically, the probability of the next coupon, or the future state, depends not just on which coupon was redeemed just now, in the current state, but also on which coupons were redeemed earlier. In this case the MMM would make more sense.
Now let us take another example. A person landed on a company’s website through paid search, then through organic search, and has currently come to the website through the direct channel. What is the propensity of this user landing on the website through the display channel? Since there is no fixed limit on the number of times a visitor can come to the company’s platform through a specific channel, this becomes a classic case for Markov chain analysis.

6 Markov model: Understanding the transition matrix
To understand why the Markov model is chosen over any other stochastic technique in this paper, we need to understand the Markov property, namely that the next state of the system depends only upon its present state, not on the preceding states. To understand this with an example, suppose a user comes to a website in the sequence below before buying something worth $100:
Start > Paid Search > Social > Paid Search > Organic Search > Social > Direct > Purchase
We can see that every visit here leads to another visit through a different channel. Hence we can create pairs of these channels: (Start, Paid Search); (Paid Search, Social); (Social, Paid Search); (Paid Search, Organic Search); (Organic Search, Social); (Social, Direct) and (Direct, End), where End denotes the purchase that closes the journey.
Just by ordering these pairs by their first channel, we can create a data set like (Start, Paid Search); (Organic Search, Social); (Paid Search, Social); (Paid Search, Organic Search); (Social, Paid Search); (Social, Direct) and (Direct, End). It can now be seen that each channel can be followed only by certain channels, which can be shown as Start: [Paid Search]; Organic Search: [Social]; Paid Search: [Social], [Organic Search]; Social: [Paid Search], [Direct]; Direct: [End]. This data helps create a transition diagram with the respective transition probabilities:
Figure 2 Transitioning from one state to another (see online version for colours)
These transition probabilities represent the same data set that was created above, in a visual format that makes it easier to anticipate a user’s next action depending upon his/her current state. According to the data set described above, there is only one path at the start, paid search, hence its transition probability is 100%. Once a person comes through paid search, there are two options: coming to the platform again through social or through organic search, each with 50% probability. Once a person comes from organic search, there is only one option based on the sample journey, coming back to the website through the social channel, hence a probability of 100%. From the social channel there are two options, coming again through the direct channel or through paid search, so each has a probability of 50%. Once a person comes through the direct channel, based on the sample journey, there is only one option, ending the journey with a conversion.
Considering that the Markov model is the only stochastic model that can handle and help understand the impact of these transitions on final conversions, this algorithm was selected to understand the impact of user journeys on our client’s digital platform.
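The pair-counting argument above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function name `transition_probabilities` is hypothetical, and the journey is the sample sequence from this section with the final purchase written as End.

```python
from collections import Counter, defaultdict


def transition_probabilities(paths):
    """Estimate first-order transition probabilities from a list of paths."""
    counts = defaultdict(Counter)
    for path in paths:
        # Each consecutive pair (current, next) is one observed transition.
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    probabilities = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        probabilities[state] = {nxt: c / total for nxt, c in next_counts.items()}
    return probabilities


journey = ["Start", "Paid Search", "Social", "Paid Search",
           "Organic Search", "Social", "Direct", "End"]
probs = transition_probabilities([journey])
for state, row in probs.items():
    print(state, row)
```

With this single journey the sketch reproduces the probabilities described for Figure 2: 100% from Start to Paid Search, 50% each from Paid Search to Social and Organic Search, and 50% each from Social to Paid Search and Direct. With real data, many journeys would be passed in and the counts pooled.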

7 Markov chain analysis: High level steps
The Markov attribution model is available as a package within most statistical analysis tools. Executing the model is as simple as pressing ‘Ctrl + Enter’ on one line:
    markov_model(data, 'path', 'total_conversions', var_value = 'total_conversion_value')
The challenge is always to reach the stage where the analyst can apply this code to the data set created. At a very high level, executing Markov chain analysis involves the following steps:
Data extraction and cleaning
↓
Transforming data to create the input data for the Markov algorithm
↓
Executing the algorithm
↓
Understanding results

7.1 Data extraction and cleaning
1 Data extraction: Ideally this should be hit-by-hit data. This data is easily available through Adobe in the form of data feeds, or through GA 360 via BigQuery. If one does not have access to Google 360 or BigQuery, one can always use the ‘Top Conversion Paths’ report under Conversions and then Multi-Channel Funnels.
2 Missing value treatment: Data is often not clean and has its fair share of issues. Some companies do not have data from certain browsers, geographies or devices. For example, suppose a company is not getting data from Chrome version 63, which as per W3 stats is still used by around 2.3% of users across the globe. Just by not getting data from this one source, the company can lose visibility of approximately 2% of users, and 2% of revenue, in the database. This 2% differential can create some major discrepancies in the data and in our analysis downstream. In such cases the analyst has to choose a missing value treatment:
a Deleting observations: If the missing data is not very relevant to the data model, or not too many data points are missing, or deleting the data points will not introduce any bias or skewness in the data, then these data points can be deleted.
b Deleting variables: If some variables have too many missing observations and are probably not significant to the model, then those variables can be deleted.
c Imputation through mean, median or mode: If the observations are important to the model and not too many are missing, we try to replace the missing values with the mean, median or mode. If the observations that are present are normally distributed and not skewed (some very large or very small values), we replace the missing values with the mean (average) of the observations that are present. If the data shows some skewness, though not very large, the median is used to replace the missing values. In the case of categorical variables, the missing values are replaced by the mode (the most frequent value in the variable).
3 Outlier treatment: The best method to check whether a variable has any outliers is to calculate the difference between the median and the mean of the observations. If the mean is far lower or far higher than the median, it suggests that the variable has some outlier values. At this point a few things need to be ascertained:
a If this is an error: Many times wrong data is reported, for example revenue for an order as 999,999 (there are 900,000 six-digit numbers, of which only 9 have all digits the same, a probability of one in 100,000), or 1234 as 345q, which looks like an order number being tracked as revenue by mistake. In such cases one can remove these values and treat them as missing values, as discussed earlier.
b Capping by 1st or 99th percentile: If the outliers are few, replace any value below the 1st percentile with the 1st percentile value, and any value above the 99th percentile with the 99th percentile value.
c Predict the outliers: This is the most statistically sound technique of outlier treatment, where the value of the outlier, rather than being taken as is, is first predicted by regression modelling, and this predicted value is then used to replace the outlier in the data set.
However, analysts should try never to remove the outliers altogether from the data.
Figure 3 Identifying outliers (see online version for colours)

7.2 Transforming data for creating the input data for Markov algorithm
The Markov chain model expects data to be input in a pre-defined manner. It expects a data set with four columns:
var_path: the stacked channel/campaign path that the user/visitor has taken in their visits/sessions so far. As an example, for Figure 1 the path would look like: Display > Natural > Paid > Direct
var_conv (orders/conversions): the number of conversions that have happened for the respective path
var_value (revenue/conversion value): the value that we have earned from our conversions (a notional value in the case of goals within GA), which can be revenue
var_null: the number of times there has been no conversion for the respective path
One will notice that the data is required not at the user level or channel level, but at the path level. By path we mean every path (combination of channels) that any user has traversed, whether or not that combination of channels resulted in a conversion. Channels have to be separated with the ‘>’ sign.
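To make the path-level format concrete, here is a small Python sketch that rolls individual journeys up into the four columns described above. It is an illustration under stated assumptions rather than the authors' pipeline: the input layout (one record per finished journey, holding the ordered channels, a converted flag and the conversion value), the function name `build_path_table` and the sample journeys are all hypothetical.

```python
from collections import defaultdict


def build_path_table(journeys):
    """Aggregate (channels, converted, value) records into path-level rows."""
    table = defaultdict(lambda: {"total_conversions": 0,
                                 "total_conversion_value": 0.0,
                                 "total_null": 0})
    for channels, converted, value in journeys:
        path = ">".join(channels)  # channels are separated by '>'
        row = table[path]
        if converted:
            row["total_conversions"] += 1
            row["total_conversion_value"] += value
        else:
            row["total_null"] += 1  # the path was traversed without converting
    return dict(table)


# Hypothetical journeys, for illustration only.
journeys = [
    (["Display", "Natural", "Paid", "Direct"], True, 100.0),
    (["Display", "Natural", "Paid", "Direct"], False, 0.0),
    (["Paid", "Paid"], True, 80.0),
    (["Paid", "Paid"], True, 20.0),
    (["Direct"], False, 0.0),
]
path_table = build_path_table(journeys)
for path, row in path_table.items():
    print(path, row)
```

Each distinct path becomes one row, so the path acts as the unique key, which is the shape the Markov routine expects.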
Sample data looks like:

Table 1 Creating raw dataset
Path                                            Total_conversions   Total_conversions_value   Total_null
Natural_search>Natural_search                   550                 204,679.81                126,837
Paid_search>Paid_search                         342                 91,195.84                 27,531
Partners                                        340                 81,243.96                 94,586
Affiliates                                      275                 74,890.89                 77,157
Natural_search>Natural_search>Natural_search    162                 82,903.34                 29,703
Paid_search>Referring_domains                   153                 39,404.75                 9,120
Referring_domains>Referring_domains             126                 28,518.18                 46,960
Direct>Direct                                   114                 43,059.62                 38,309

Table 1 is created from the raw data set by creating var_path and calculating the various metrics of conversions, conversion value and null conversions, as explained above. This file needs to have the path as the unique key, with the number of orders against each path as var_conv, revenue as var_value, and the number of times the path did not result in a conversion as var_null. Once we have created this input data file, we can easily apply the Markov model available within the package.

7.3 Executing the model
For executing the Markov analysis in R, we used the native library of ‘markovchain’. Note that a function named markov_model with these arguments is exported by the R package ‘ChannelAttribution’, so that package needs to be loaded for the call below to run.
    attr_markov <- markov_model(final_data, 'path', 'total_conversions', var_value = 'total_conversion_value')
The function ‘markov_model’ can help us execute the model and get the desired results. The data set created in the earlier steps will have the required columns and will allow the model to execute seamlessly.

7.4 Understanding results
Markov analysis provides two kinds of outputs:
1 Transition matrix: it provides the mathematical probability of the next step, given the current step. Example output:
Figure 4 Transition matrix (see online version for colours)
This way we can determine the best possible channel paths to plan for our customers and how we should lead them to our website to maximise the probability of conversion.
To read the transition matrix shown above: across all the bookings that we have had on our website, the probability of conversion is 30.62% if a user comes from the affiliate channel, but this user’s most probable next step is to come from ‘natural search’. The matrix also shows that the probability of conversion is highest when a user comes to our website from an internal platform such as an internal blog, a support site or a campaign microsite. At the same time, the probability of conversion is far lower when a user comes from a paid campaign on social media channels.

2 Attribution of conversions: in addition to the transitions, the analysis provides the statistically calculated attribution of conversions to channels. Table 2 shows the contribution of each channel in terms of its share of orders and of revenue earned from the users that came through it, along with the average order value. It clearly shows that high-value buyers are coming through apps, which motivates us to engage more with current and prospective customers through mobile apps and to run campaigns for app downloads.

At the same time, social media is under-performing on all major metrics: conversions, revenue and average order value. In this way the value of each channel and the incremental gains from it can be understood, so that campaign programmes can be optimised further and marketing costs reduced.
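The two outputs described above can be illustrated with a minimal Python sketch. It is not the authors' R run and makes several assumptions: a first-order chain with added `(start)`, `(conversion)` and `(null)` states, and credit assigned by the removal effect (the relative drop in conversion probability when a channel is taken out), which is a common rule in Markov attribution but is not spelled out in the text. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_matrix(rows, sep=">"):
    """Estimate first-order transition probabilities from path-level rows."""
    counts = defaultdict(lambda: defaultdict(float))
    for row in rows:
        channels = row["var_path"].split(sep)
        # A path is walked var_conv times ending in CONV and var_null times ending in NULL.
        for end, weight in ((CONV, row["var_conv"]), (NULL, row["var_null"])):
            if weight == 0:
                continue
            states = [START] + channels + [end]
            for current, nxt in zip(states, states[1:]):
                counts[current][nxt] += weight
    return {
        state: {nxt: n / sum(outgoing.values()) for nxt, n in outgoing.items()}
        for state, outgoing in counts.items()
    }


def conversion_probability(matrix, removed=None, sweeps=200):
    """Probability of reaching CONV from START; a removed channel never converts."""
    prob = {state: 0.0 for state in matrix}
    prob[CONV], prob[NULL] = 1.0, 0.0
    for _ in range(sweeps):  # repeated sweeps until the values settle
        for state, outgoing in matrix.items():
            if state == removed:
                prob[state] = 0.0
            else:
                prob[state] = sum(p * prob[nxt] for nxt, p in outgoing.items())
    return prob[START]


def attribute_conversions(rows, sep=">"):
    """Split total conversions across channels in proportion to removal effects."""
    matrix = transition_matrix(rows, sep)
    base = conversion_probability(matrix)
    channels = [state for state in matrix if state != START]
    effects = {
        c: 1.0 - conversion_probability(matrix, removed=c) / base for c in channels
    }
    total_effect = sum(effects.values())
    total_conv = sum(row["var_conv"] for row in rows)
    return {c: total_conv * e / total_effect for c, e in effects.items()}


# Hypothetical path-level rows, not the paper's data.
rows = [
    {"var_path": "Paid_search>Natural_search", "var_conv": 3, "var_null": 1},
    {"var_path": "Natural_search", "var_conv": 2, "var_null": 4},
    {"var_path": "Affiliates>Natural_search", "var_conv": 1, "var_null": 1},
    {"var_path": "Paid_search", "var_conv": 1, "var_null": 3},
]

matrix = transition_matrix(rows)
attribution = attribute_conversions(rows)
for channel, conversions in attribution.items():
    print(channel, round(conversions, 2))
```

Because the removal effects are normalised before being applied, the attributed conversions add up to the total number of conversions in the input. Packaged implementations may estimate the same quantities differently, for example by simulation or with higher-order chains.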
Table 2 Attributing conversions and revenue

| Channel name | % Conversions (Custom) | % Rev (Custom) | Rev/Order ($) |
|---|---|---|---|
| Affiliates | 1.81% | 1.88% | 482.21 |
| Direct | 14.74% | 15.89% | 499.98 |
| Email | 0.24% | 0.22% | 428.52 |
| Display | 0.20% | 0.20% | 455.32 |
| App | 0.08% | 0.12% | 616.87 |
| Internal | 8.03% | 7.01% | 404.51 |
| Natural_Search | 37.12% | 42.26% | 528.22 |
| Paid_Search | 18.85% | 18.18% | 447.35 |
| Partners | 3.88% | 3.61% | 433.03 |
| Referring_Domains | 34.84% | 30.49% | 406.00 |
| Social_Networks_organic | 0.17% | 0.12% | 310.37 |
| Social_Networks_paid | 0.04% | 0.02% | 355.88 |

8 Conclusions

This study proposes an attribution framework based on Markov chain analysis in order to solve the problem of attribution. Higher-order models allow new data-driven insights into the interplay of channels in a multi-channel environment. This paper aligns with previous studies (Berman, 2018; Li and Kannan, 2014) in indicating that last-click attribution measures are appropriate neither for determining the actual contribution of individual channels nor for determining the interplay between channels, and it highlights the relevance of the Markov attribution model, along with its method of implementation. For online advertising, having the right attribution model is crucial, as it drives performance metrics and advertising insights. The study illustrates the importance of attribution in allowing marketers to determine accurately the credit due to each channel for the overall conversions. It has important managerial implications for allocating marketing budgets across marketing channels.

9 Implications

One of the most critical questions that marketers face after creating the attribution model is whether to reallocate the marketing budget across channels according to the newly derived insights. The answer is always a resounding yes. Marketers need to ensure that they are able to monitor the ROI (return on investment) of their changed marketing strategies and that sufficient data is collected from the new strategies before taking any meaningful decision.

Attribution analysis is probably one of the most challenging aspects of digital analytics. It can help marketers find the true ROI of each channel and the efficacy of campaigns, and identify the channels and campaigns that are better at acquisition and at conversion. With this information they can create a media sequence plan and a campaign calendar, identify the high-performing and under-valued channels, incentivise media teams to improve performance at a holistic level, and ascertain the value of their strategic campaigns through more tactical data. Many attribution models are available as out-of-the-box solutions within analytics tools, and different industries have their own ways of measuring success, but any data measurement methodology without statistical vetting is as good as having no measurement strategy at all.

References

Agichtein, E., Brill, E., Dumais, S. and Ragno, R. (2006) ‘Learning user interaction models for predicting web search result preferences’, in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August, pp.3–10.
Ailawadi, K.L. and Farris, P.W. (2017) ‘Managing multi- and omni-channel distribution: metrics and research directions’, Journal of Retailing, Vol. 93, No. 1, pp.120–135.
Anderl, E., Becker, I., von Wangenheim, F. and Schumann, J.H. (2016) ‘Mapping the customer journey: lessons learned from graph-based online attribution modeling’, International Journal of Research in Marketing, Vol. 33, No. 3, pp.457–474.
Bell, D.R., Gallino, S. and Moreno, A. (2014) ‘How to win in an omnichannel world’, MIT Sloan Management Review, Vol. 56, No. 1, p.45.
Berman, R. (2018) ‘Beyond the last touch: attribution in online advertising’, Marketing Science, Vol. 37, No. 5, pp.771–792.
Chandler-Pepelnjak, J. (2009) Measuring ROI Beyond the Last Ad: Winners and Losers in the Purchase Funnel are Different When Viewed Through a New Lens, Microsoft Advertising Institute.
Dalessandro, B., Perlich, C., Stitelman, O. and Provost, F. (2012) ‘Causally motivated attribution for online advertising’, in Proceedings of the Sixth International Workshop on Data Mining for Online Advertising and Internet Economy, ACM, p.7.
Danaher, P.J. and van Heerde, H.J. (2018) ‘Delusion in attribution: caveats in using attribution for multimedia budget allocation’, Journal of Marketing Research, Vol. 55, No. 5, pp.667–685.
De Haan, E., Wiesel, T. and Pauwels, K. (2016) ‘The effectiveness of different forms of online advertising for purchase conversion in a multiple-channel attribution framework’, International Journal of Research in Marketing, Vol. 33, No. 3, pp.491–507.
Goldfarb, A. and Tucker, C. (2011) ‘Online display advertising: targeting and obtrusiveness’, Marketing Science, Vol. 30, No. 3, pp.389–404.
Jordan, P., Mahdian, M., Vassilvitskii, S. and Vee, E. (2011) ‘The multiple attribution problem in pay-per-conversion advertising’, in International Symposium on Algorithmic Game Theory, Springer, Berlin, Heidelberg, pp.31–43.
Kannan, P.K. (2017) ‘Digital marketing: a framework, review and research agenda’, International Journal of Research in Marketing, Vol. 34, No. 1, pp.22–45.
Kireyev, P., Pauwels, K. and Gupta, S. (2016) ‘Do display ads influence search? Attribution and dynamics in online advertising’, International Journal of Research in Marketing, Vol. 33, No. 3, pp.475–490.
Kitts, B., Wei, L., Au, D., Powter, A. and Burdick, B. (2010) ‘Attribution of conversion events to multi-channel media’, in 2010 IEEE International Conference on Data Mining, pp.881–886.
Lee, G. (2010) ‘Death of ‘last click wins’: media attribution and the expanding use of media data’, Journal of Direct, Data and Digital Marketing Practice, Vol. 12, No. 1, pp.16–26.
Lewis, R.A., Rao, J.M. and Reiley, D.H. (2011) ‘Here, there, and everywhere: correlated online behaviors can lead to overestimates of the effects of advertising’, in Proceedings of the 20th International Conference on World Wide Web, March, pp.157–166.
Li, H. and Kannan, P.K. (2014) ‘Attributing conversions in a multichannel online marketing environment: an empirical model and a field experiment’, Journal of Marketing Research, Vol. 51, No. 1, pp.40–56.
Montoya, R., Netzer, O. and Jedidi, K. (2010) ‘Dynamic allocation of pharmaceutical detailing and sampling for long-term profitability’, Marketing Science, Vol. 29, No. 5, pp.909–924.
Necsulescu, N. (2015) ‘Multi-channel attribution and its role in marketing investment’, Journal of Digital and Social Media Marketing, Vol. 3, No. 2, pp.125–134.
Neslin, S.A. and Shankar, V. (2009) ‘Key issues in multichannel customer management: current knowledge and future directions’, Journal of Interactive Marketing, Vol. 23, No. 1, pp.70–81.
Neslin, S.A., Grewal, D., Leghorn, R., Shankar, V., Teerling, M.L., Thomas, J.S. and Verhoef, P.C. (2006) ‘Challenges and opportunities in multichannel customer management’, Journal of Service Research, Vol. 9, No. 2, pp.95–112.
Netzer, O., Lattin, J.M. and Srinivasan, V. (2008) ‘A hidden Markov model of customer relationship dynamics’, Marketing Science, Vol. 27, No. 2, pp.185–204.
Osur, A., Riley, E., Moffett, T., Glass, S. and Komar, E. (2012) The Forrester Wave Interactive Attribution Vendors Q2 2012, Forrester White Paper.
Raman, K., Mantrala, M.K., Sridhar, S. and Tang, Y.E. (2012) ‘Optimal resource allocation with time-varying marketing effectiveness, margins and costs’, Journal of Interactive Marketing, Vol. 26, No. 1, pp.43–52.
Rana, S. (2019) ‘Moving in the realm of big data: using analytics in management research and practices’, FIIB Business Review, Vol. 8, No. 1, pp.7–8.
Rangaswamy, A. and van Bruggen, G.H. (2005) ‘Opportunities and challenges in multichannel marketing: an introduction to the special issue’, Journal of Interactive Marketing, Vol. 19, No. 2, pp.5–11.
Rossi, P.S., Romano, G., Palmieri, F. and Iannello, G. (2003) ‘A hidden Markov model for internet channels’, in Proceedings of the 3rd IEEE International Symposium on Signal Processing and Information Technology (IEEE Cat. No. 03EX795), IEEE, pp.50–53.
Schwartz, E.M., Bradlow, E., Fader, P. and Zhang, Y. (2011) ‘Children of the HMM’: Modeling Longitudinal Customer Behavior at Hulu.com, WCAI Working Paper Series.
Shao, X. and Li, L. (2011) ‘Data-driven multi-touch attribution models’, in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, pp.258–264.
Sinha, R., Saini, S. and Anadhavelu, N. (2014) ‘Estimating the incremental effects of interactions for marketing attribution’, in 2014 International Conference on Behavioral, Economic, and Socio-Cultural Computing, pp.1–6.
Wiesel, T., Pauwels, K. and Arts, J. (2011) ‘Practice prize paper – marketing’s profit impact: quantifying online and off-line funnel progression’, Marketing Science, Vol. 30, No. 4, pp.604–611.
Xu, L., Duan, J.A. and Whinston, A. (2014) ‘Path to purchase: a mutually exciting point process model for online advertising and conversion’, Management Science, Vol. 60, No. 6, pp.1392–1412.