I have been trying to build an uplift model that gives the incremental probability of a customer responding to a treatment. I am thinking of using the pylift library for my model, and I had a few questions:

1. Does the library provide any function to return the incremental probabilities?

2. I don't have experimental data; I am trying to build the model on call-center data. I have therefore defined my test group as the customers who received a call from the call center during a particular time interval, and the customers who were not called during the same period as my control group. Does this approach make sense?

3. Is there any other library for building such a model?

1 Answer


Does the library provide any function to return the incremental probabilities?

If you mean the predicted incremental values, then yes. The predicted lift values are contained in the class object: up.transformed_y_test_pred for the test set, and up.transformed_y_train_pred for the training set (this name really should be changed to something more intuitive).
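For context, those predictions come from the transformed-outcome approach. The snippet below is not pylift's code, just a minimal sketch of the transformation it is built on: with treatment probability p, the transformed outcome z has E[z | X] equal to the uplift, so a regressor fit on z predicts incremental values directly.

```python
def transform_outcome(y, w, p=0.5):
    """Transformed outcome: y is the observed outcome, w is 1 if
    treated else 0, p is the treatment probability P(treatment).
    E[z | X] equals the uplift tau(X)."""
    return y * (w - p) / (p * (1 - p))

# With p = 0.5 this is simply +2y for treated rows and -2y for control rows.
print(transform_outcome(1, 1))  # treated converter  -> 2.0
print(transform_outcome(1, 0))  # control converter  -> -2.0
print(transform_outcome(0, 1))  # non-converter      -> 0.0
```

A model trained to predict z is then returning exactly the incremental values you ask about, which is what `up.transformed_y_test_pred` holds.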

I don't have experimental data; I am trying to build the model on call-center data. I have therefore defined my test group as the customers who received a call from the call center during a particular time interval, and the customers who were not called during the same period as my control group. Does this approach make sense?

This depends, but it's generally pretty dangerous to use observational data, because you could be introducing heavy bias -- e.g. imagine that happy people are more likely to pick up when you call than sad people. You compare conversion rates between the called and uncalled groups and find a large difference, but when you then run a randomized experiment, the lift disappears. In this contrived example, happy people are simply more likely to convert than sad people, and that is the effect you have discovered -- not the effect of the call. In the jargon, the problem is that unconfoundedness is not satisfied.
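You can simulate this contrived example in a few lines to see how large the bias can get. Everything below is made-up toy data in which the call has exactly zero true effect, yet the naive treated-vs-control comparison shows a large "lift":

```python
import random

random.seed(0)

# Toy confounder: mood drives both who ends up "treated" (picks up the
# call) and the outcome, while the call itself has zero true effect.
n = 100_000
treated_conv = treated_n = 0
control_conv = control_n = 0
for _ in range(n):
    happy = random.random() < 0.5
    picks_up = random.random() < (0.8 if happy else 0.2)   # confounded "treatment"
    converts = random.random() < (0.3 if happy else 0.05)  # mood drives outcome
    if picks_up:
        treated_n += 1
        treated_conv += converts
    else:
        control_n += 1
        control_conv += converts

naive_lift = treated_conv / treated_n - control_conv / control_n
print(round(naive_lift, 3))  # clearly positive, even though the true uplift is 0
```

The naive estimate comes out around 15 percentage points of "lift" that a randomized experiment would never reproduce.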

However, if you could somehow remove this bias -- by, say, predicting $P(treatment|X)$ perfectly -- then you could use this propensity score to reweight all your observations. Once you build this propensity model, add its predictions as a column to your dataframe and pass that column name via the col_policy kwarg when you instantiate TransformedOutcome in pylift. Everything downstream should then do the reweighting for you automatically and correctly.
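A sketch of why propensity reweighting removes the bias (this is toy data with made-up numbers, not pylift's internal code): if the propensity e(x) = P(treatment | x) is known exactly, the propensity-weighted transformed outcome z = y * (w - e) / (e * (1 - e)) has mean equal to the true uplift. Below, mood drives both assignment and outcome, the true uplift is exactly zero, and the weighted average correctly lands near zero:

```python
import random

random.seed(1)

# Same kind of confounded setup: "happy" customers pick up more often
# and convert more often, but the treatment has zero true effect.
n = 200_000
zs = []
for _ in range(n):
    happy = random.random() < 0.5
    e = 0.8 if happy else 0.2                        # true propensity (assumed known)
    w = random.random() < e                          # confounded treatment assignment
    y = random.random() < (0.3 if happy else 0.05)   # outcome; zero true effect
    zs.append(y * (w - e) / (e * (1 - e)))           # propensity-weighted transform

avg_uplift = sum(zs) / n
print(round(avg_uplift, 3))  # close to 0, the true uplift
```

In practice, of course, you never know e(x) perfectly; you estimate it, and any misspecification of that model leaks back into your uplift estimates.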

But I'd still say that this is dangerous...

Is there any other library to build such model?

Absolutely. There are a couple of other notable ones, but they are both in R.

  • Susan Athey's causalTree package
  • Leo Guelman's uplift R package (library(uplift))
  • How does this answer the question? Commented May 9, 2019 at 3:21
  • Hey Robert, I have not used the pylift library. I built a model with an intervention flag in my data, which helps segregate the test and control populations, and then calculated the incremental probability. However, I am facing exactly the issue you mentioned with using observational data. How can I identify and remove the biases caused by using observational data? Any input will be helpful. Commented May 9, 2019 at 9:34
  • Update: Athey, Tibshirani and Wager have a new R package {grf} that implements causal_forest as a more efficient (and theoretically sound) version of Guelman's uplift forest. – Johannes Commented Oct 3, 2019 at 19:48
