The probability of receiving treatment, also known as the propensity score, plays a very special role in the estimation of treatment effects. In this post we will consider how to estimate the propensity score. In subsequent posts we will look at how it can be used to enrich our causal analysis.
Recall unconfoundedness assumes that conditional on the covariates, potential outcomes are independent of treatment assignment:
\[ (Y(0), Y(1)) \perp D \; | \; X \]
Previously we saw how we could derive an excellent treatment effect estimator using this assumption by matching individuals based on their covariate values. It turns out that, as proven in Rosenbaum and Rubin's seminal paper in 1983, the above condition implies that the following is also true:
\[ (Y(0), Y(1)) \perp D \; | \; p(X) \]
In other words, conditional on the propensity score \(p(X) = \mathrm{P}(D = 1 | X)\), treatment assignment is essentially as good as random. This means that for subjects who share the same propensity score (even if their covariate vectors differ), the difference in average outcomes between the treated and the control units identifies the conditional average treatment effect \(\mathrm{E}[Y(1) - Y(0) | p(X)]\). Thus, instead of matching on the covariate vectors \(X\) themselves, we can match on the single-dimensional propensity score \(p(X)\), aggregate across subjects, and still arrive at a valid estimate of the overall average treatment effect.
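To see why this holds, here is a sketch of the standard argument behind Rosenbaum and Rubin's result: it suffices to show that, conditional on \(p(X)\), the probability of treatment does not depend on the potential outcomes. By iterated expectations,
\[ \mathrm{P}(D = 1 | Y(0), Y(1), p(X)) = \mathrm{E}[\mathrm{P}(D = 1 | Y(0), Y(1), X) \; | \; Y(0), Y(1), p(X)] = \mathrm{E}[p(X) \; | \; Y(0), Y(1), p(X)] = p(X), \]
where the second equality uses unconfoundedness to drop the potential outcomes from the inner probability. The same calculation gives \(\mathrm{P}(D = 1 | p(X)) = p(X)\), so treatment assignment is independent of the potential outcomes once we condition on \(p(X)\).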
Indeed, as we shall see, the propensity score is useful in other ways beyond providing yet another estimator. It can also be used, for example, for assessing and improving covariate balance.
To estimate the propensity score, note that since it is simply the probability of receiving treatment conditional on the covariates, it can be estimated from data on the observable variables \(D\) and \(X\). As the functional form of \(\mathrm{P}(D = 1 | X)\) is usually unknown, Hirano, Imbens, and Ridder (2003) suggest estimating it using a flexibly-specified logistic regression.
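To make the idea concrete, here is a minimal sketch of such a flexibly-specified logistic regression outside of Causalinference; statsmodels and the variable names D and X (a binary treatment vector and a two-column covariate matrix) are assumptions, not part of the package.

import numpy as np
import statsmodels.api as sm

# Flexible design matrix: linear terms, squares, and the interaction of X0 and X1.
Z = np.column_stack([
    X[:, 0], X[:, 1],
    X[:, 0] ** 2, X[:, 1] ** 2,
    X[:, 0] * X[:, 1],
])
Z = sm.add_constant(Z)

logit = sm.Logit(D, Z).fit(disp=0)   # maximum likelihood fit of P(D = 1 | X)
pscore = logit.predict(Z)            # estimated propensity score for each subject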
In Causalinference, this can be done using either of the methods est_propensity or est_propensity_s. The former allows the user to specify which covariates to include linearly and/or quadratically, while the latter makes this choice automatically based on a sequence of likelihood ratio (LR) tests.
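For example, an explicit specification might look like the following; the argument format shown is my recollection of the package's documentation (zero-based column indices to include linearly, and pairs of columns whose products to include quadratically), so verify it against the est_propensity docstring.
>>> # Assumed argument format -- check the est_propensity docstring:
>>> # lin lists columns of X to include linearly, qua lists pairs of
>>> # columns whose products enter as quadratic/interaction terms.
>>> causal.est_propensity(lin=[0, 1], qua=[(0, 0), (0, 1), (1, 1)])
>>> print(causal.propensity)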
More specifically, Imbens and Rubin (2015) recommend the following algorithm for variable selection in the estimation of the propensity score. At a high level, the steps are:
1. Start with a set of basic covariates, \(X_B\), that are always included linearly.
2. Consider the remaining covariates one at a time; add the one whose inclusion yields the largest likelihood ratio test statistic, provided that statistic exceeds a threshold \(C_{lin}\). Repeat until no remaining covariate passes.
3. Form the squares and pairwise products of the covariates selected so far, and add them one at a time in the same greedy fashion, this time using a threshold \(C_{qua}\).
This procedure is not guaranteed to select the best functional form for \(\mathrm{P}(D = 1 | X)\), but it should nonetheless result in a sensible specification that groups subjects with similar covariate values together through a single-dimensional score.
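For readers who want to see the mechanics, below is a rough, self-contained sketch of this kind of likelihood-ratio-based forward selection for the linear terms. It is not Causalinference's implementation; statsmodels and the names D and X are assumptions.

import numpy as np
import statsmodels.api as sm

def loglike(D, Z):
    """Log-likelihood of a logistic regression of D on Z plus an intercept."""
    return sm.Logit(D, sm.add_constant(Z)).fit(disp=0).llf

def select_linear(D, X, basic, C_lin=1.0):
    """Greedily add columns of X to the (non-empty) basic set as long as the
    best remaining candidate's LR statistic exceeds the threshold C_lin."""
    selected = list(basic)
    candidates = [j for j in range(X.shape[1]) if j not in selected]
    while candidates:
        base_ll = loglike(D, X[:, selected])
        # LR statistic for adding each remaining candidate on its own
        stats = [2 * (loglike(D, X[:, selected + [j]]) - base_ll)
                 for j in candidates]
        best = int(np.argmax(stats))
        if stats[best] < C_lin:
            break
        selected.append(candidates.pop(best))
    return selected

# Example: always include column 0, then let the LR tests decide the rest.
# selected = select_linear(D, X, basic=[0], C_lin=1.0)

The quadratic step works the same way, applied to the squares and pairwise products of the selected columns with \(C_{qua}\) in place of \(C_{lin}\).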
To perform the above propensity score estimation procedure in Causalinference and display the logistic regression results, simply run:
>>> causal.est_propensity_s()
>>> print(causal.propensity)
Estimated Parameters of Propensity Score

                    Coef.       S.e.          z      P>|z|      [95% Conf. int.]
--------------------------------------------------------------------------------
     Intercept     -2.839      0.526     -5.401      0.000     -3.870     -1.809
            X1      0.486      0.153      3.178      0.001      0.186      0.786
            X0      0.466      0.155      3.011      0.003      0.163      0.770
         X1*X0      0.080      0.015      5.391      0.000      0.051      0.109
         X0*X0     -0.045      0.012     -3.579      0.000     -0.069     -0.020
         X1*X1     -0.045      0.013     -3.542      0.000     -0.070     -0.020
The table above shows the standard results that would usually be reported when a logistic regression is run. To access these outputs directly, as well as other computed values like the estimated propensity scores, we can inspect the dictionary-like attribute propensity:
>>> causal.propensity.keys()
['coef', 'lin', 'qua', 'loglike', 'fitted', 'se']
Of note are propensity['lin'] and propensity['qua'], which contain, respectively, the linear and quadratic terms selected by the algorithm, and propensity['fitted'], which contains the estimated propensity score of each subject.
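For instance, a quick way to eyeball overlap is to compare the fitted scores across the two groups. This is only a minimal sketch: the array D below is assumed to be the treatment indicator that was originally passed to CausalModel.

# Minimal overlap check using the fitted propensity scores; D is assumed to be
# the treatment indicator array used when constructing CausalModel.
pscore = causal.propensity['fitted']
print('Treated mean score: %.3f' % pscore[D == 1].mean())
print('Control mean score: %.3f' % pscore[D == 0].mean())
print('Score range: [%.3f, %.3f]' % (pscore.min(), pscore.max()))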
It is also possible to customize est_propensity_s by explicitly specifying the basic covariates \(X_B\) and the variable inclusion decision thresholds \(C_{lin}\) and \(C_{qua}\). Details on how this can be done can be found in the documentation for the method.
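As an illustration only, a customized call might look like the following. The keyword names lin_B, C_lin, and C_qua are my assumption about the method's signature, so verify them against the docstring before relying on them.
>>> # Assumed keyword names -- check help(causal.est_propensity_s) first.
>>> causal.est_propensity_s(lin_B=[0], C_lin=1, C_qua=2.71)
>>> print(causal.propensity)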
Hirano, K., Imbens, G., & Ridder, G. (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71, 1161–1189.
Imbens, G. & Rubin, D. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press.
Rosenbaum, P. & Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.