$\begingroup$

I am trying to build a propensity to pay model given an intervention to a customer.

Context:

  • The population consists of customers who were supposed to pay some amount by a certain date but have not paid.
  • Such customers are contacted via call centres to remind them of the payment due.
  • Some customers pay, some don't.

Problem statement: build propensity-to-pay scores for these customers.

My current approach:

  • Data: calls made via the call centre in a given month.
  • If a customer made a payment within 6 days of the intervention, tag them as 1, else 0.
  • Considered a few demographic features as well as a few operational metrics that may be correlated with a customer making a payment.
  • Build a classification model (maybe logistic regression) to get the propensity scores.

Questions:

  • Does the approach described above make sense?
  • What is the need for propensity score matching?
  • The data is observational, not experimental. Can I use the customers tagged 1 (as defined earlier) as the test group and those tagged 0 as the control group?

Any input on this will be very helpful. Thanks in advance!

$\endgroup$
  • $\begingroup$ Logistic regression is not a classification method. It is a direct probability estimation method. See http://fharrell.com/post/classification. And since you know the day of payment, use it in the analysis rather than an arbitrary dichotomization of time. A Cox proportional hazards model would work, with customers who have not yet paid censored at the last day on which they were known not to have paid. $\endgroup$ Commented Dec 10, 2018 at 13:12
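
The censoring setup described in this comment can be sketched as follows. This is only an illustration under stated assumptions: the `records` layout (call day, payment day or `None`) and the `last_observed_day` value are hypothetical, and the code builds (duration, event) pairs and a Kaplan-Meier curve rather than a Cox model, which would additionally need covariates and a survival analysis library.

```python
def to_survival_rows(records, last_observed_day):
    """Turn (call_day, paid_day or None) into (duration, event) pairs.

    event = 1 means the payment was observed; event = 0 means the customer
    is censored at the last day they were known not to have paid.
    """
    rows = []
    for call_day, paid_day in records:
        if paid_day is not None:
            rows.append((paid_day - call_day, 1))
        else:
            rows.append((last_observed_day - call_day, 0))
    return rows


def kaplan_meier(rows):
    """Return [(t, S(t))] at each payment time, where S(t) = P(still unpaid after t)."""
    at_risk = len(rows)
    surv = 1.0
    curve = []
    for t in sorted({d for d, _ in rows}):
        events = sum(1 for d, e in rows if d == t and e == 1)
        leaving = sum(1 for d, _ in rows if d == t)
        if events:
            surv *= 1 - events / at_risk
            curve.append((t, surv))
        at_risk -= leaving  # payers and censored customers both leave the risk set
    return curve


# Hypothetical toy data: (day of call, day of payment or None if not yet paid)
records = [(0, 3), (0, 6), (2, None), (5, 50), (10, None)]
rows = to_survival_rows(records, last_observed_day=60)
curve = kaplan_meier(rows)
```

Unlike the 0/1 tag at a fixed cutoff, the unpaid customers here contribute the information that they had not paid up to their censoring day, instead of being counted as definite non-payers.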

1 Answer

$\begingroup$

"Customer paid within X days" is presumably your binary response variable; call it R.

However, what you want is a "propensity to pay score", call it S, which corresponds to the probability that R = 1.

So one formulation is, for every customer C, S = P(R=1|C). That is, your propensity score is just the probability that the customer pays within X days. The problem then becomes how to estimate P(R=1|C).

There are lots of confounding factors. Are the relationships between your factors of interest and your response variable linear? Is willingness to pay correlated across customers? Etc.

Since you have a binary response variable and probably mixed numeric/categorical data, I'd suggest starting out with XGBoost. It's easy to use, well documented and state of the art. You could also try a GLM (e.g. logit or probit regression) but it'll almost always perform worse.
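
As a rough illustration of estimating S = P(R=1|C), here is a minimal sketch of the logit option fitted by plain gradient descent, using only the Python standard library. It is not XGBoost, and the two features and the toy data are hypothetical; in practice you would use a proper library and hold out data for evaluation.

```python
import math


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on the log loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * k
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(k):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def propensity(w, b, x):
    """Estimated P(R=1 | C) for a customer with feature vector x."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical features: [days overdue / 10, past on-time payment rate]
X = [[0.5, 0.9], [1.0, 0.8], [3.0, 0.2], [4.0, 0.1], [0.8, 0.7], [3.5, 0.3]]
y = [1, 1, 0, 0, 1, 0]  # R: 1 if the customer paid within X days of the call

w, b = fit_logistic(X, y)
scores = [propensity(w, b, x) for x in X]
```

Each entry of `scores` is a probability between 0 and 1, so customers can be ranked or thresholded on it directly.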

$\endgroup$
  • $\begingroup$ Thanks for the input, this is very helpful. Could you also help me choose the value of X? There are customers who have paid as late as 45 days after the intervention. $\endgroup$ Commented Dec 11, 2018 at 3:15
  • $\begingroup$ I suggest picking X such that 90% of customers who pay within 1 year do so within X days. $\endgroup$
    – usgroup
    Commented Dec 11, 2018 at 7:47
  • $\begingroup$ Since I used cross-sectional data to build the model, I checked the distribution of days to payment after the call for those who made a payment. As the distribution was right-skewed, I used bootstrapping to calculate the 97% confidence interval and took 6 days, since the bootstrapped mean was 6. Does that make sense? $\endgroup$ Commented Dec 11, 2018 at 8:22
  • $\begingroup$ Not really. The mean won't help you. You want the number of days by which time most people would have paid. I.e. X such that 90% of those who pay do so in X days or less. $\endgroup$
    – usgroup
    Commented Dec 11, 2018 at 11:46
  • $\begingroup$ Got it. Another thing: while doing a bit of research on propensity modelling, I found that propensity scores are generally generated on randomized experimental data. In the case of observational data, which is generally not randomized, we get propensity scores by fitting models and then match the data to make a statement about the causal effect. I didn't completely understand the concept of matching in my context. It would be really helpful if you could shed some light on this. $\endgroup$ Commented Dec 11, 2018 at 13:33
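
The 90% rule for choosing X suggested in the comments above can be sketched as follows. This is a minimal illustration; the function name `days_covering` and the `days_to_pay` values are hypothetical, and the input is assumed to contain only customers who eventually paid.

```python
import math


def days_covering(days_to_pay, coverage=0.90):
    """Smallest X such that at least `coverage` of payers paid within X days."""
    if not days_to_pay:
        raise ValueError("need at least one paying customer")
    ordered = sorted(days_to_pay)
    k = math.ceil(coverage * len(ordered))  # number of payers that must be covered
    return ordered[k - 1]


# Hypothetical days from call to payment, right-skewed with one late payer
days_to_pay = [1, 2, 2, 3, 3, 4, 5, 6, 9, 45]
x = days_covering(days_to_pay)

# Response variable R: 1 if the customer paid within x days, else 0
labels = [1 if d <= x else 0 for d in days_to_pay]
```

With this toy data, nine of the ten payers fall within the chosen window and only the customer who paid after 45 days is left outside it.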
