Hello,
I'm using a difference-in-differences (DID) model weighted with propensity scores to estimate the impact of a treatment on student test scores.
The data are for the same schools in both 2012 (baseline, pre-treatment) and 2016 (endline, post-treatment). But the students in each round are different, and there may be significant differences in composition between the treatment and control groups. So I calculate propensity score weights using psmatch2, separately for the baseline and endline data, as follows:
Code:
psmatch2 treatment inc nobreakfast if year==2012
rename _pscore pscore2012
psmatch2 treatment inc nobreakfast if year==2016
* create a weight that is the inverse (??) of pscore
gen ps_weight=1/pscore2012 if year==2012
replace ps_weight=1/_pscore if year==2016
I then use these weights in the DID model as follows:
Code:
xtset school_id
xtreg stdmath i.y2016##i.treatment inc nobreakfast ps_weight, fe vce(robust)
Question 1: Is the above calculation of weights correct? That is, is it right to use the inverse of the propensity score as the weight and to enter it in the DID model as above?
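For context on Question 1, here is a minimal sketch of how inverse-probability weights are usually built and applied, as far as I understand it. It assumes the full data set is in memory and contains both treated and control schools. The names ps and w_att are placeholders of mine. The choice of ATT-type weights, a plain logit in place of psmatch2, and areg in place of xtreg are all assumptions rather than a settled answer.

```stata
* Sketch only: assumes the full data set (treated and control schools) is in memory.
* Estimate the propensity score with a logit, separately for each year.
gen double ps = .
foreach y in 2012 2016 {
    logit treatment inc nobreakfast if year == `y'
    predict double ps_`y' if e(sample), pr
    replace ps = ps_`y' if year == `y'
    drop ps_`y'
}

* ATT-type weights: treated students get 1, control students get p/(1-p).
* (ATE-type weights would instead be 1/p for treated and 1/(1-p) for controls.)
gen double w_att = cond(treatment == 1, 1, ps/(1 - ps)) if !missing(ps)

* The weight enters as a pweight, not as a regressor.
* areg is used here because xtreg, fe requires weights to be constant within panel,
* and student-level weights vary within school.
areg stdmath i.y2016##i.treatment inc nobreakfast [pweight = w_att], ///
    absorb(school_id) vce(cluster school_id)
```

In this sketch the DID estimate is the coefficient on the y2016#treatment interaction. The main effect of treatment is absorbed by the school fixed effects if treatment does not vary within school.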
Question 2: The bigger problem is that there are actually four groups: treatment pre, treatment post, control pre, and control post. So shouldn't the propensity scores be estimated for each student relative to one of the four groups, such as treatment pre? How do I do this in Stata? Is it as below?
Code:
gen group= 1 if year==2012 & treatment==1
replace group= 2 if year==2012 & treatment==0
replace group= 3 if year==2016 & treatment==1
replace group= 4 if year==2016 & treatment==0
mlogit group inc nobreakfast y2016
predict ps
gen ps_weight2=1/ps
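For Question 2, one variant I am considering is sketched below. It weights every student toward the covariate mix of group 1 (treatment, pre) by taking the ratio of the predicted probability of group 1 to the predicted probability of the student's own group. It assumes group is coded 1 to 4 as above and that all four groups are present in the data. The names p1 to p4, p_own and ps_weight4 are placeholders of mine. I leave y2016 out of the mlogit because it is determined by group, and I am not certain this is the right specification.

```stata
* Sketch only: assumes group is coded 1-4 as above and all four groups exist.
* y2016 is omitted because it is a deterministic function of group.
mlogit group inc nobreakfast

* One predicted probability per outcome; a single newvar would return only outcome 1.
predict double p1 p2 p3 p4, pr

* Probability of the group each student actually belongs to.
gen double p_own = .
forvalues g = 1/4 {
    replace p_own = p`g' if group == `g'
}

* Weight toward group 1 (treatment, pre): students in group 1 get weight 1.
gen double ps_weight4 = p1 / p_own
```

As in the two-group case, I would then pass ps_weight4 to the DID regression as a pweight rather than as a covariate.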
* Reference: E. Leuven and B. Sianesi (2003). "PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing". http://ideas.repec.org/c/boc/bocode/s432001.html.
The data used is:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(school_id year treatment stud_id) double nobreakfast float(y2016 stdmath inc)
18 2012 1 5 0 0 -.20173773 6.584059
18 2012 1 11 0 0 -.7822701 6.974636
18 2012 1 19 0 0 1.3076463 7.804821
18 2016 1 38 0 1 -.8983765 6.926063
18 2016 1 39 0 1 -1.0725362 7.141726
19 2012 1 49 0 0 2.004285 7.547534
19 2012 1 50 0 0 .6690608 7.30482
19 2016 1 59 1 1 .9593269 6.634033
19 2016 1 66 0 1 1.3656995 6.412418
19 2016 1 67 0 1 1.3656995 9.253709
19 2016 1 68 1 1 .3207414 5.673886
19 2016 1 69 1 1 -.02757803 7.625872
19 2016 1 70 0 1 -.7242168 7.333391
19 2016 1 71 0 1 -.4339507 7.54091
19 2016 1 72 0 1 .14658166 9.662361
19 2016 1 73 1 1 .3787946 6.246556
19 2016 1 74 1 1 -1.014483 7.201874
19 2016 1 75 0 1 .26268813 7.833391
19 2016 1 76 1 1 -.3178442 5.33342
20 2012 1 77 1 0 -1.4208556 5.031689
20 2012 1 78 0 0 -.14368449 5.297753
20 2012 1 79 0 0 -.3178442 7.30482
20 2012 1 83 0 0 -.7242168 6.292029
20 2016 1 99 0 1 .727114 5.673886
20 2016 1 100 1 1 -.14368449 7.040909
end