Propensity score matching difference-in-difference via Stuart et al. 2014 methodology

Emil Alnor

Join Date: Jun 2021

Posts: 130
#1

19 Aug 2022, 04:00

Hi all,

I am trying to evaluate the effect of a university teaching reform on its students' wage outcomes using PSM-DiD, following the methodology proposed by Stuart et al. (2014), "Using propensity scores in difference-in-differences models to estimate the effects of a policy change" (https://pubmed.ncbi.nlm.nih.gov/25530705/).

Considering four groups:
1 - Treated in pre-treatment period
2 - Control in pre-treatment period
3 - Treated in post-treatment period
4 - Control in post-treatment period

...they suggest to "fit a multinomial logistic regression predicting Group as a function of a set of observed covariates X. Each individual will have four resulting propensity scores, e_k(X_i): the probability of being in Group k, for k = 1–4. (Note that these four will sum to one for each individual). The weights are then created in such a way that each of the four groups is weighted to be similar to Group 1, the treatment group in the pre period. This is accomplished using the following weight for individual i:

w_i = e_1(X_i) / e_g(X_i)

where g refers to the group that individual i was actually in."

I try to do this by writing

Code:

forval x = 1/4 {
    mlogit group "list of covariates", baseoutcome(`x')
    predict psa`x'
}

My problem is that psa1 = psa2 = psa3 = psa4, and they do not sum to 1, whereas I want psa`x' to be the probability that the individual is in group x. Apparently, I can get around this by estimating four separate binary logistic regressions instead, one for each group, but I am not sure whether that is econometrically correct, since Stuart et al. suggest estimating the probabilities via multinomial logistic regression.

I cannot show you my real data, as it is confidential, but using the example hospital data from the helpfile to 'didregress' this is what I try:

Code:

use https://www.stata-press.com/data/r17/hospdd, clear
gen group= 1 if inrange(month, 1, 3) & inrange(hospital, 1, 18)
replace group=2 if inrange(month, 1, 3) & inrange(hospital, 19, 46)
replace group=3 if inrange(month, 4, 7) & inrange(hospital, 1, 18)
replace group=4 if inrange(month, 4, 7) & inrange(hospital, 19, 46)
label define group 1 "pre-treated" 2 "pre-control" 3 "post-treated" 4 "post-control"
label values group group

*multiple logit
forval x = 1/4{
    mlogit group i.frequency, baseoutcome(`x')
    predict psa`x'
}
gen psa = psa1+psa2+psa3+psa4
tab psa

*single logit
gen pretreated=0
replace pretreated=1 if group==1
gen precontrol=0
replace precontrol=1 if group==2
gen posttreated=0
replace posttreated=1 if group==3
gen postcontrol=0
replace postcontrol=1 if group==4
logit pretreated i.frequency
predict psb1
logit precontrol i.frequency
predict psb2
logit posttreated i.frequency
predict psb3
logit postcontrol i.frequency
predict psb4
gen psb = psb1+psb2+psb3+psb4
tab psb

So my question is how I can correctly estimate the probability of being in each of the four groups defined above using multinomial logistic regression?

I am aware that there is a user-written package 'diff' which implements PSM-DiD, but it only allows kernel matching, which I am not interested in.
Tags: mlogit, PSD-DiD
Fei Wang

Join Date: Oct 2021

Posts: 726
#2

19 Aug 2022, 04:28

According to the description in the paper, the weight variable is computed as below.

Code:

mlogit group i.frequency
predict p*
gen weight = .
forvalues i = 1/4 {
    replace weight = p1/p`i' if group == `i'
}
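
A note on why one mlogit fit is enough here. As far as I understand Stata's mlogit postestimation behavior, predict with a single new variable and no outcome() option returns the probability of the first outcome, and changing baseoutcome() only reparameterizes the model without changing the fitted probabilities. That would explain four identical variables in the loop from post #1. The sketch below asks for each outcome explicitly and checks that the four probabilities sum to one. The variable names q1-q4 and qsum are my own, chosen only for illustration.

```stata
* Sketch: one mlogit fit, one predicted probability per group (hospdd example data)
use https://www.stata-press.com/data/r17/hospdd, clear
gen group = 1 if inrange(month, 1, 3) & inrange(hospital, 1, 18)
replace group = 2 if inrange(month, 1, 3) & inrange(hospital, 19, 46)
replace group = 3 if inrange(month, 4, 7) & inrange(hospital, 1, 18)
replace group = 4 if inrange(month, 4, 7) & inrange(hospital, 19, 46)

mlogit group i.frequency

* q1 = Pr(group==1), ..., q4 = Pr(group==4); q* names are illustrative
forvalues k = 1/4 {
    predict q`k', pr outcome(`k')
}

* The four probabilities should sum to one, up to floating-point error
gen double qsum = q1 + q2 + q3 + q4
assert abs(qsum - 1) < 1e-5
```

Using predict p* after the single mlogit call should give the same four variables in one step.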

Then the DiD estimation can be implemented using a weighted linear regression.
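
For what it is worth, here is a minimal sketch of what that weighted regression could look like on the hospdd example data. The outcome satis, the indicator names treated and post, and clustering on hospital are my assumptions for illustration, not something taken from the paper or from the posts above.

```stata
* Sketch: weighted DiD regression with the group-1 propensity weights
use https://www.stata-press.com/data/r17/hospdd, clear
gen group = 1 if inrange(month, 1, 3) & inrange(hospital, 1, 18)
replace group = 2 if inrange(month, 1, 3) & inrange(hospital, 19, 46)
replace group = 3 if inrange(month, 4, 7) & inrange(hospital, 1, 18)
replace group = 4 if inrange(month, 4, 7) & inrange(hospital, 19, 46)

* Four-group propensity scores and weights, as in the code above
mlogit group i.frequency
predict p*
gen weight = .
forvalues i = 1/4 {
    replace weight = p1/p`i' if group == `i'
}

* Treated-group and post-period indicators (names are illustrative)
gen byte treated = inlist(group, 1, 3)
gen byte post    = inlist(group, 3, 4)

* The coefficient on 1.treated#1.post is the weighted DiD estimate;
* satis as outcome and hospital-level clustering are assumptions
regress satis i.treated##i.post [pweight = weight], vce(cluster hospital)
```

By construction every observation in group 1 gets a weight of exactly 1, so the other three groups are reweighted towards the pre-period treated group.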
Emil Alnor

Join Date: Jun 2021

Posts: 130
#3

19 Aug 2022, 04:46

Thanks for your concise answer, which solved my problem.
Kehinde Atoloye

Join Date: Nov 2021

Posts: 119
#4

18 Apr 2023, 09:02

I am also interested in this method. Can I use the weight variable generated above with the didregress command? I would like to understand whether doing so would introduce any violation or over-fitting, given the DiD approach that didregress takes. Thanks.