Propensity score oversampling and matching for uplift modeling

Carla Vairetti*, Franco Gennaro, Sebastián Maldonado

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we propose a novel matching strategy to correct for confounding in uplift modeling. Our method, called propensity score oversampling and matching (ProSOM), extends the well-known propensity score matching (PSM) technique by addressing one of its main limitations: handling small datasets in which the distribution of the causal variable is imbalanced. Uplift modeling adds a further complication, since the class labels must also be taken into account. The proposed method draws a parallel between uplift modeling and class-imbalanced classification, extending existing oversampling techniques to create synthetic elements from the treatment group. We design an algorithm that performs class-aware oversampling in the treatment group and then matches samples from this group with the control group. This can be seen as a novel hybrid undersampling-oversampling solution for causal learning. Experiments on five datasets show the advantages of ProSOM in terms of predictive performance: it achieves the best Qini coefficient on all five datasets when compared with PSM and other resampling solutions.
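The two stages described in the abstract (class-aware oversampling of the treatment group, then matching against the control group) can be illustrated with a minimal sketch. This is not the authors' implementation: the SMOTE-style interpolation, the gradient-descent logistic propensity model, the choice to fit it after oversampling, the greedy 1:1 matching, the caliper of 0.2, and every function name below are assumptions made for illustration only.

```python
import math
import random


def propensity(w, x):
    """Logistic propensity estimate P(treated | x); the bias is the last weight."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def fit_propensity(X, t, lr=0.1, epochs=300):
    """Fit a logistic regression by batch gradient descent (assumed model)."""
    d, n = len(X[0]), len(X)
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, ti in zip(X, t):
            err = propensity(w, x) - ti
            for j in range(d):
                grad[j] += err * x[j]
            grad[d] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def oversample_treated(X, y, n_new, k=3, rng=random):
    """Class-aware SMOTE-style step: interpolate each seed with one of its
    k nearest treated neighbours that share the seed's class label."""
    synth_X, synth_y = [], []
    for _ in range(n_new):
        i = rng.randrange(len(X))
        same = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        if not same:  # only one sample with this label: duplicate it
            synth_X.append(list(X[i]))
        else:
            same.sort(key=lambda j: math.dist(X[i], X[j]))
            nb = X[rng.choice(same[:k])]
            lam = rng.random()
            synth_X.append([a + lam * (b - a) for a, b in zip(X[i], nb)])
        synth_y.append(y[i])
    return synth_X, synth_y


def match(ps_treated, ps_control, caliper=0.2):
    """Greedy 1:1 nearest-neighbour matching on the propensity score,
    without replacement; controls left unmatched are discarded."""
    free = list(range(len(ps_control)))
    pairs = []
    for i, p in enumerate(ps_treated):
        if not free:
            break
        j = min(free, key=lambda c: abs(ps_control[c] - p))
        if abs(ps_control[j] - p) <= caliper:
            pairs.append((i, j))
            free.remove(j)
    return pairs


def prosom_sketch(X, t, y, seed=0):
    """Oversample the treated group up to the control group size, then match."""
    rng = random.Random(seed)
    Xt = [x for x, ti in zip(X, t) if ti == 1]
    yt = [yi for yi, ti in zip(y, t) if ti == 1]
    Xc = [x for x, ti in zip(X, t) if ti == 0]
    yc = [yi for yi, ti in zip(y, t) if ti == 0]
    sX, sy = oversample_treated(Xt, yt, max(0, len(Xc) - len(Xt)), rng=rng)
    Xt_all, yt_all = Xt + sX, yt + sy
    w = fit_propensity(Xt_all + Xc, [1] * len(Xt_all) + [0] * len(Xc))
    pairs = match([propensity(w, x) for x in Xt_all],
                  [propensity(w, x) for x in Xc])
    return Xt_all, yt_all, Xc, yc, pairs


# Toy data (hypothetical): 12 treated and 48 control units, two covariates.
demo = random.Random(1)
t = [1] * 12 + [0] * 48
X = [[demo.gauss(0.5 * ti, 1.0), demo.gauss(0.0, 1.0)] for ti in t]
y = [int(demo.random() < 0.3 + 0.2 * ti) for ti in t]
Xt_all, yt_all, Xc, yc, pairs = prosom_sketch(X, t, y)
```

Each entry of `pairs` links an index into the (real plus synthetic) treated set to an index into the control set. In this sketch the oversampling step supplies the "oversampling" half of the hybrid, while dropping unmatched controls supplies the "undersampling" half; an uplift model would then be trained on the matched units.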

Original language: English
Journal: European Journal of Operational Research
State: Accepted/In press - 2024

Bibliographical note

Publisher Copyright: © 2024 Elsevier B.V.

Keywords

  • Analytics
  • Data resampling
  • Oversampling
  • Propensity score matching
  • Uplift modeling
