Churn Modeling
Modeling churn behavior of bank customers using predictive data mining techniques
M Purna Chandar1,2, Arijit Laha1, P Radha Krishna1
1 Institute for Development and Research in Banking Technology (IDRBT), Castle Hills, Hyderabad 500057
2 Dept. of Computer and Information Sciences, University of Hyderabad, Gachibowli, Hyderabad 500046
mpchandar@mtech.idrbt.ac.in, {alaha,prkrishna}@idrbt.ac.in
Abstract
Acquiring new customers is costlier than retaining existing ones. Managing customer relationships therefore plays a vital role in improving the overall profitability of a company. Many areas come under the Customer Relationship Management (CRM) umbrella, such as churn prediction, target marketing, cross-selling and up-selling, and customer profiling. Churn is defined as the propensity of a customer to cease doing business with a company in a given time period. In this paper, we focus on modeling the churn behavior of bank customers in the Indian scenario. Ideally, many customer characteristics, such as demographic details, psychographic (transactional details), product purchase details and customer perception details, are vital in modeling the churn behavior of bank customers. In practice, however, especially in the Indian banking scenario, banks do not capture all of this data. Hence, we have to work with the available data and extract the required features from it. We give a detailed guideline for converting raw customer data into meaningful data suited to modeling churn behavior. Predictive data mining techniques are then used to convert this meaningful data into knowledge. We built models with three decision tree algorithms, namely CART, TreeNet and C5.0. CART yielded a better classification rate in predicting churn behavior, and C5.0 yielded a better classification rate in predicting the non-churn (active) behavior of customers. As predicting churn is more important for a bank, CART can be said to have yielded the better overall classification rate. CART generated seventeen decision rules, which can be useful in predicting probable churners.
Index Terms: Banking, Predictive Data Mining, Churn Prediction, Classification models.
Introduction
The high cost of customer acquisition and customer education requires companies to make large upfront investments in customers. However, due to easy access to information and a wide range
Introducing Predictive Data Mining
Data mining is the process of exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules [5]. It can also be defined as the process of selecting, exploring and modeling large amounts of data to uncover previously unknown data patterns for business advantage [6]. Using data mining, descriptive models and predictive models can be built. Descriptive models are built on the concept of unsupervised learning and predictive models are built on the concept of supervised learning. Here, only predictive data mining models are discussed. In a predictive model, one of the variables (the target variable or response variable) is expressed as a function of the other variables. This permits the value of the response variable to be predicted from given values of the other variables (the explanatory or predictor variables). In the churn prediction problem, the response variable, i.e., the future status of the customers, can take only two values viz. Active or
Next in this section, we describe the real-life data we worked with and the filtration (data cleansing) steps used to prepare an efficient dataset. We obtained the customer data from a nationalized Indian bank. The details of the data obtained are shown in Table 1.
TABLE 1
DETAILS OF CUSTOMER DATA

Table Name       Attributes                                                         No. of Records
Customer         Custno, Name1, Name2, Address, Status, DoB (Date of Birth), Edn    40,880
General Ledger   Custno, AcNo, Descr, DOP (Date of a/c opening)                     1,08,029
Dormant          Acno, Descr, Dormant                                               18,002
Master           Acno, Balance, Dormant flag                                        31,022
Txn              Acno, Trntype, Date, Amount                                        31,39,020
Ttype            Ttype#, Typecode, Descr                                            93
The success rate for active customers is relatively low. This is because some customers, although their status is marked as active, exhibited churn characteristics. It is this segment of customers on which the bank has to concentrate and apply churn prevention methodologies. There are 17 leaf nodes in the tree generated using CART, and hence 17 decision rules can be drawn from it. Table 4 shows the 17 decision rules.
TABLE 4
DECISION RULES GENERATED BY CART

Rule #   Rule                                                                                                        Predicted Class   # Cases
1        AvgDrAmt <= 608 and AvgCrAmt <= 37.5                                                                        Churn             93
2        AvgDrAmt <= 608 and AvgCrAmt > 37.5 and AvgCrAmt <= 1655.5 and Duration > 18                                Active            61
3        AvgCrAmt <= 1655.5 and AvgDrAmt > 608                                                                       Churn             102
4        AvgCrAmt > 1655.5 and Duration <= 23.5 and AvgDrAmt <= 1300.5                                               Active            20
5        AvgCrAmt > 1655.5 and Duration <= 23.5 and AvgDrAmt > 1300.5 and PercClosedRecently > 0.0416667             Churn             26
6        AvgCrAmt > 1655.5 and Duration > 23.5 and Duration <= 27.5                                                  Active            615
13       ...                                                                                                         Churn             10
14       ...                                                                                                         ...               4
15       ...                                                                                                         ...               10
16       ...                                                                                                         ...               10
17       ...                                                                                                         ...               10
Of the 17 rules generated by CART, 12 are supported by a sufficient number of cases, and a manager can use these as rules of thumb for predicting probable future churners.
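As an illustration of how such rules of thumb might be applied, the following minimal sketch encodes only the six rules legible in Table 4 as a lookup function. The function name `classify_customer` and the argument names are hypothetical; the thresholds are copied from the table. Rules 7 to 17 are not reproduced, so the function returns None for a customer that matches none of the six listed rules.

```python
# Rules 1-6 from Table 4, encoded as (rule number, condition, predicted class).
# Names below are illustrative assumptions; thresholds come from the table.
RULES = [
    (1, lambda dr, cr, dur, pcr: dr <= 608 and cr <= 37.5, "Churn"),
    (2, lambda dr, cr, dur, pcr: dr <= 608 and 37.5 < cr <= 1655.5 and dur > 18, "Active"),
    (3, lambda dr, cr, dur, pcr: cr <= 1655.5 and dr > 608, "Churn"),
    (4, lambda dr, cr, dur, pcr: cr > 1655.5 and dur <= 23.5 and dr <= 1300.5, "Active"),
    (5, lambda dr, cr, dur, pcr: cr > 1655.5 and dur <= 23.5 and dr > 1300.5
        and pcr > 0.0416667, "Churn"),
    (6, lambda dr, cr, dur, pcr: cr > 1655.5 and 23.5 < dur <= 27.5, "Active"),
]


def classify_customer(avg_dr_amt, avg_cr_amt, duration, perc_closed_recently):
    """Return (rule number, predicted class) for the first matching rule, else None."""
    for number, condition, predicted in RULES:
        if condition(avg_dr_amt, avg_cr_amt, duration, perc_closed_recently):
            return number, predicted
    return None  # customer falls under a rule not listed here


# Example: low average debit and credit amounts match rule 1.
print(classify_customer(500, 20, 10, 0.0))
```

Because the conditions come from the leaves of a single tree, the six rules are mutually exclusive, so the order in which they are checked does not matter.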
Classification Tree Model using TreeNet
TreeNet uses the stochastic gradient boosting algorithm, a technique for improving the accuracy of a predictive function by applying the function repeatedly in a series and combining the weighted outputs so that the total prediction error is minimized. In many cases, the predictive accuracy of such a series greatly exceeds the accuracy of the base function used alone. The stochastic gradient boosting algorithm used by TreeNet is optimized for improving the accuracy of models built on decision trees. The number of trees formed by TreeNet was set to 200, and the regression loss criterion used was Huber-M. The confusion matrix and classification success rate for the training and testing datasets are shown in Table 5 and Table 6 respectively.
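The idea of boosting described above can be sketched in a few lines. The code below is not TreeNet's implementation; it is a simplified illustration under several assumptions: a single explanatory variable, one-split regression stumps as the base function, a fixed learning rate as the weighting, random subsampling of rows in each round (the "stochastic" part), and Huber pseudo-residuals with leaf values set to the mean residual rather than TreeNet's exact update. All function names, parameter defaults and the toy data are hypothetical.

```python
import random


def huber_pseudo_residual(y, pred, delta):
    """Negative gradient of the Huber loss: the residual clipped to [-delta, delta]."""
    return max(-delta, min(delta, y - pred))


def fit_stump(xs, targets):
    """Least-squares stump on one feature: (threshold, left_value, right_value)."""
    mean = sum(targets) / len(targets)
    best = (float("inf"), xs[0], mean, mean)  # fallback if no split is possible
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if sse < best[0]:
            best = (sse, threshold, lm, rm)
    return best[1:]


def boost(xs, ys, n_trees=200, rate=0.1, subsample=0.5, delta=1.0, seed=0):
    """Fit stumps in series, each one on the pseudo-residuals of the current model."""
    rng = random.Random(seed)
    init = sum(ys) / len(ys)
    preds = [init] * len(xs)
    stumps = []
    k = max(2, int(subsample * len(xs)))
    for _ in range(n_trees):
        rows = rng.sample(range(len(xs)), k)  # random subsample for this round
        residuals = [huber_pseudo_residual(ys[i], preds[i], delta) for i in rows]
        t, lv, rv = fit_stump([xs[i] for i in rows], residuals)
        stumps.append((t, lv, rv))
        preds = [p + rate * (lv if x <= t else rv) for p, x in zip(preds, xs)]
    return init, rate, stumps


def predict(model, x):
    """Sum the initial estimate and the weighted output of every stump."""
    init, rate, stumps = model
    return init + sum(rate * (lv if x <= t else rv) for t, lv, rv in stumps)


# Toy data (made up): average credit amount and a 1/0 churn indicator.
avg_credit = [10, 20, 30, 40, 50, 2000, 2500, 3000, 3500, 4000]
churned = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
model = boost(avg_credit, churned)
print(predict(model, 10), predict(model, 4000))
```

Each stump alone is a weak predictor; the series becomes accurate because every new stump is fitted to what the previous ones still get wrong.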
C5.0 is another classification algorithm used to generate decision trees. Unlike CART, this algorithm produces trees with a variable number of branches per node. Customer Duration, CRTxns, DRTxns, AvgCrAmt, AvgDrAmt and PercClosedRecently are used as explanatory variables, and Status is taken as the target variable. 80% of the dataset, i.e., 1,192 samples containing 929 active customer records and 263 churned customer records, forms the training dataset. The remaining 20% of the dataset, i.e., 299 samples containing 244 active customer records and 58 churned customer records, forms the testing dataset. The confusion matrix and classification success rate for the training and testing datasets are shown in Table 7 and Table 8 respectively.
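The paper does not state how the 80/20 partition was drawn. A minimal sketch, assuming a simple seeded random shuffle, is given below; the function name `split_train_test`, the seed and the placeholder records are hypothetical. The total of 1,491 records is inferred from the 1,192 training and 299 testing samples quoted above.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder customer identifiers standing in for the real records.
customers = list(range(1491))
train, test = split_train_test(customers)
print(len(train), len(test))
```

A stratified split, which preserves the Active/Churn proportions in both parts, would be a reasonable alternative when the churn class is small.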
TABLE 7
CONFUSION MATRIX AND PREDICTION SUCCESS RATE FOR TRAINING DATA

True Class   Total # samples   Predicted Active   Predicted Churn   Success %
Active       929               883                46                95.04
Churn        263               61                 182               69.2

TABLE 8
CONFUSION MATRIX AND PREDICTION SUCCESS RATE FOR TEST DATA

True Class   Total # samples   Predicted Active   Predicted Churn   Success %
Active       244               234                10                95.9
Churn        58                18                 40                68.9
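The success rate of a class is the number of its samples predicted correctly divided by its total number of samples. The following short sketch recomputes the rates from the counts in Table 8; the function and variable names are hypothetical, and the overall rate is an additional figure not reported in the table.

```python
def success_rates(confusion):
    """Per-class success rate (%): correct predictions / total samples of the class."""
    rates = {}
    for true_class, predicted in confusion.items():
        total = sum(predicted.values())
        rates[true_class] = 100.0 * predicted[true_class] / total
    return rates


# Counts copied from Table 8: true class -> predicted class -> number of samples.
table_8 = {
    "Active": {"Active": 234, "Churn": 10},
    "Churn": {"Active": 18, "Churn": 40},
}

rates = success_rates(table_8)
correct = sum(table_8[c][c] for c in table_8)
total = sum(sum(row.values()) for row in table_8.values())
overall = 100.0 * correct / total

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
print(f"Overall: {overall:.2f}%")
```

The Active rate computes to about 95.90% and the Churn rate to about 68.97%, which the table reports as 95.9 and 68.9.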
Discussion Of Results
In this study, we experimented with three classification techniques, namely CART, TreeNet and C5.0. The prediction success rates for the Churn class by CART and TreeNet are quite high, whereas C5.0 showed poor results in predicting churn customers. However, the prediction success rate for the Active class by C5.0 is higher than that of the other two techniques. To reap significant benefits, a model must predict churn behavior well, so a model with a higher prediction success rate for the Churn class should be chosen. In all the decision tree models, all the explanatory attributes were found to influence the target variable, i.e., the status of the customer.
Conclusions
References
[1]. Dyche, J. (2001), The CRM Handbook: A Business Guide to Customer Relationship Management, Addison-Wesley, Reading, MA.
[2]. Reichheld, F.F. and Sasser, W.E., Jr. (1990), Zero defections: quality comes to service, Harvard Business Review, 68(5), 105-111.
[3]. Van den Poel, D. and Lariviere, B. (2004), Customer attrition analysis for financial services using proportional hazard models, European Journal of Operational Research, 157, 196-217.
[4]. Athanassopoulos, A.D. (2000), Customer satisfaction cues to support market segmentation and explain switching behavior, Journal of Business Research, 47(3), 191-207.
[5]. Berry, M.J.A. and Linoff, G. (2000), Mastering Data Mining: The Art and Science of Customer Relationship Management, Wiley Computer Publishing, New York, NY.
[6]. SAS Institute (2000), "Best practice in churn prediction", A SAS Institute White Paper.
[7]. Hand, D., Mannila, H. and Smyth, P. (2001), Principles of Data Mining.