TY - JOUR
T1 - Propensity score oversampling and matching for uplift modeling
AU - Vairetti, Carla
AU - Gennaro, Franco
AU - Maldonado, Sebastián
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/8/1
Y1 - 2024/8/1
N2 - In this paper, we propose a novel matching strategy to correct for confounding in uplift modeling. Our method, called propensity score oversampling and matching (ProSOM), extends the well-known propensity score matching (PSM) technique by addressing one of its main limitations: dealing with small datasets that face an imbalance in the distribution of the causal variable. Apart from this, we also face the additional complexity of dealing with class labels. The proposed method establishes a parallel between uplift modeling and class-imbalance classification as it extends existing oversampling techniques to create synthetic elements from the treatment group. We design an algorithm that performs class-aware data oversampling in the treatment group, and then it matches samples from this group with the control group. This can be seen as a novel hybrid undersampling-oversampling solution for causal learning. Experiments on five datasets show the virtues of ProSOM in terms of predictive performance, achieving the best Qini coefficient for all five datasets in relation to PSM and other resampling solutions.
AB - In this paper, we propose a novel matching strategy to correct for confounding in uplift modeling. Our method, called propensity score oversampling and matching (ProSOM), extends the well-known propensity score matching (PSM) technique by addressing one of its main limitations: dealing with small datasets that face an imbalance in the distribution of the causal variable. Apart from this, we also face the additional complexity of dealing with class labels. The proposed method establishes a parallel between uplift modeling and class-imbalance classification as it extends existing oversampling techniques to create synthetic elements from the treatment group. We design an algorithm that performs class-aware data oversampling in the treatment group, and then it matches samples from this group with the control group. This can be seen as a novel hybrid undersampling-oversampling solution for causal learning. Experiments on five datasets show the virtues of ProSOM in terms of predictive performance, achieving the best Qini coefficient for all five datasets in relation to PSM and other resampling solutions.
KW - Analytics
KW - Data resampling
KW - Oversampling
KW - Propensity score matching
KW - Uplift modeling
UR - http://www.scopus.com/inward/record.url?scp=85188427240&partnerID=8YFLogxK
U2 - 10.1016/j.ejor.2024.03.024
DO - 10.1016/j.ejor.2024.03.024
M3 - Article
AN - SCOPUS:85188427240
SN - 0377-2217
VL - 316
SP - 1
EP - 12
JO - European Journal of Operational Research
JF - European Journal of Operational Research
IS - 3
ER -