PSM with panel data using --psmatch2--, a problem.

Zhang_Lu

Join Date: Oct 2014
Posts: 155


21 May 2016, 19:06

Hi, my current project uses a "PSM+DID" empirical design and my dataset is longitudinal. The panel structure strengthens the empirical identification, but it also complicates my data management, especially the question of how to implement PSM on panel data correctly. My template is Heyman et al. (JIE, 2007); in that paper, they implement a year-by-year PSM on "whether a firm is foreign-acquired". I rely on the popular user-written command --psmatch2--. The following is a snippet of my code:

Code:

  ** group by years
  egen g = group(year)
  levels g, local(gr)
  * Note that in each loop, psmatch2 replaces its _variables (_treated, _weight),
  * so it is necessary to record them in NEW variables.
  * (treatment, pairs, matched, treat_year1 and treat_year2 must already exist
  *  as missing before the loop, because they are filled in with -replace-.)
  foreach j of local gr {
      cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j', n(1) logit qui common noreplacement

      ** Collect the treated year
      by nfid (treatment), sort: gen temp = (_treated==1)
      by nfid: egen num_treated = total(temp)
      by nfid (temp), sort: replace treat_year1 = year[_N] if treat_year1==. & num_treated==1
      drop temp num_treated

      ** Collect the (untreated) match year
      by nfid (treatment), sort: gen temp = (_treated==0)
      by nfid: egen num_treated = total(temp)
      by nfid (temp), sort: replace treat_year2 = year[_N] if treat_year2==. & num_treated==1
      drop temp num_treated

      ** Copy the psmatch2 results of this year into the permanent variables
      replace treatment = _treated if treatment==.
      replace pairs     = _id      if pairs==.
      replace matched   = _weight  if matched==.
      tab year _treated if matched==1
  }

What I want is to obtain _treated (indicating the treatment and control groups) and _weight (indicating whether the observation is used in a match), and to record the year in which the treatment happened. The tricky issue is that psmatch2 overwrites these variables (_treated, _weight) in each loop iteration, so they must be saved in NEW variables, which is exactly what I did.
What worries me is the result. After the code ran, it returned a series of treatment variables, namely treatment_1 - treatment_6. Because I specified the noreplacement option, I expected each panel unit (in my case, nfid) to be used only once, occasionally twice. However, the generated matched sample looks like this:

Code:

 nfid    year  treatment  pairs(_id) treat_year
161     2000    0        1036589    1999
161     2001    0        1029618    1999
161     2002    0        1050054    1999
161     2003    .        1010596    1999
164     1998    .        1695000    1999
164     1999    0          80890    1999
164     2000    0         879366    1999
164     2001    0         781947    1999
164     2002    0         785361    1999
164     2003    .         957154    1999
169     2003    .        1681113    2004
169     2004    0        1053593    2004
171     1998    .        1548697    2000
171     1999    0         102531    2000
171     2000    1         952717    2000
171     2001    0         889980    2000
171     2002    0         848699    2000
171     2003    .         882134    2000
171     2004    0         552188    2000
173     1998    .        1674613    1999
173     1999    0          79626    1999
176     1998    .        1995491    1999
176     1999    0          40486    1999
176     2000    0         405328    1999
179     1998    .        1963984    1999
179     1999    0          55616    1999
179     2000    0         515677    1999
179     2001    0         622819    1999
179     2002    0         610122    1999
179     2003    .         597080    1999
179     2004    0         446528    1999
188     1998    .        1900139    1999
188     1999    1         117730    1999
188     2000    0         916699    1999
188     2001    0        1148184    1999
188     2002    0        1174944    1999
188     2003    .        1321465    1999
193     1998    .        1544959    1999
193     1999    1         117069    1999
193     2000    0         669927    1999
193     2001    0         907363    1999
193     2002    0         842696    1999
193     2003    .         730017    1999

The result worries me because many units are used as matched controls in EVERY year, which should not happen. I expected to obtain something like this:

Code:

nfid    year  treatment  pairs(_id) treat_year
188     1998    .        1900139    1999
188     1999    1         117730    1999
188     2000    .         916699    1999
188     2001    .        1148184    1999
188     2002    .        1174944    1999

I don't know whether something is wrong in my code. Please help me check it and figure out what is going on. Thank you.
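One possible reading of the output above, offered as a hedged sketch rather than a confirmed diagnosis: according to the psmatch2 help file, _treated is filled in for every observation in the estimation sample (0 for all controls, whether matched or not), and _id is an identifier created for all observations rather than a pair number. Observations that actually take part in a nearest-neighbour match carry a nonmissing _weight. Under that reading, copying _treated and _id for the whole yearly sample would mark unmatched firms as controls in every year. The sketch below records the results only for matched observations. It runs on simulated data, and x1 is a hypothetical stand-in for the covariate globals in the snippet.

Code:

```stata
* Sketch only: requires the user-written psmatch2 (ssc install psmatch2).
* Simulated toy panel; x1 is a hypothetical covariate standing in for $x $high_order $xv.
clear
set seed 12345
set obs 200
gen nfid = _n
expand 5
bysort nfid: gen year = 1999 + _n
gen x1 = rnormal()
gen byte bigchangetag = runiform() < 0.2

* Permanent variables that collect the yearly results
gen byte treatment = .
gen byte matched   = .

egen g = group(year)
levelsof g, local(gr)
foreach j of local gr {
    capture noisily psmatch2 bigchangetag x1 if g == `j', ///
        neighbor(1) logit common noreplacement
    if _rc continue    // skip a year in which the matching could not be run

    * Record the results only for observations that take part in a match,
    * i.e. those with a nonmissing _weight in this year's sample
    replace matched   = 1        if g == `j' & _weight < .
    replace treatment = _treated if g == `j' & _weight < .
}

* Unmatched firm-years keep treatment == . instead of being coded as controls
tab year treatment if matched == 1
```

With this bookkeeping, a firm-year that was merely part of the estimation sample stays missing in treatment. Linking each control to its treated partner would additionally need _n1, which is not attempted here.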


Zhang_Lu

Join Date: Oct 2014

Posts: 155
#2

21 May 2016, 20:44

I think I need to make my point clearer: how can I make --psmatch2-- use a treated unit ONLY ONCE? I mean that if a panel unit is treated in one period, it SHALL NOT be selected as a control for treated units in other periods. I believe this is more reasonable for PSM with panel data (am I right?). How can I achieve it?

Last edited by Zhang_Lu; 21 May 2016, 20:56.
Sebastian Geiger

Join Date: Oct 2015

Posts: 124
#3

22 May 2016, 08:56

This may oversimplify things, but the first idea I had was to identify the previously treated observations with an additional dummy, and then just use the -if- qualifier of psmatch2, like "if prev_treated!=1". In this case, previously treated individuals are excluded from the sample, and thus will not be used as control observations.

For the effects of restrictions on the sample you may consider reading this paper: http://doku.iab.de/discussionpapers/2008/dp1208.pdf
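A minimal sketch of the dummy described above, built on a hypothetical toy panel. The names nfid, year and bigchangetag follow the original post, and prev_treated is the flag suggested here. It only illustrates one way the flag could be constructed, under the assumption of one row per firm and year.

Code:

```stata
* Hypothetical toy panel: firm 1 is treated in 2000, firm 2 is never treated
clear
input nfid year bigchangetag
1 1999 0
1 2000 1
1 2001 0
2 1999 0
2 2000 0
2 2001 0
end

* Running count of treatments up to and including this year, minus the
* current year's value, gives the number of treatments in EARLIER years
bysort nfid (year): gen byte prev_treated = (sum(bigchangetag) - bigchangetag) > 0

list, sepby(nfid)

* The flag would then enter the matching call as an -if- qualifier, e.g.
*   psmatch2 bigchangetag <covariates> if year == 2001 & prev_treated != 1, ...
```

Here only firm 1 in 2001 is flagged, so it would drop out of the 2001 matching sample while still being available as a treated firm in 2000.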
Zhang_Lu

Join Date: Oct 2014

Posts: 155
#4

22 May 2016, 17:25


Thanks, the suggestion and the literature you provided are valuable.
Zhang_Lu

Join Date: Oct 2014

Posts: 155
#5

22 May 2016, 20:58


The problem is how to implement this idea with --psmatch2--. Say I define a dummy indicating whether a firm has ever been treated during the whole observation period, and adjust my code like this:

Code:

** group by years
egen g = group(year)
levels g, local(gr)
* Note that in each loop, psmatch2 replaces its _variables (_treated, _weight),
* so it is necessary to record them in NEW variables.
foreach j of local gr {
    cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j' & dummy~=., n(1) logit qui common noreplacement
}

Then psmatch2 will choose firms with at least one treatment neither as treated units nor as controls. But I actually want them to be in the treatment group, just not in the control group, so you can see the difference. I still have not figured out how to restrict the treatment and control groups separately, if that is possible with --psmatch2--.
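One way the asymmetric restriction described above might be expressed is through a single eligibility flag: a row enters the yearly sample either because it is a treated row, or because it belongs to a never-treated firm. The sketch below uses a hypothetical toy panel; ever_treated and eligible are illustrative names, not variables from my data.

Code:

```stata
* Hypothetical toy panel: firm 1 treated in 2000, firm 2 never, firm 3 in 2001
clear
input nfid year bigchangetag
1 1999 0
1 2000 1
1 2001 0
2 1999 0
2 2000 0
2 2001 0
3 1999 0
3 2000 0
3 2001 1
end

* 1 on every row of a firm that is treated at least once in the sample
bysort nfid: egen byte ever_treated = max(bigchangetag)

* A row may enter the yearly matching sample either as a treated row,
* or as a potential control from a never-treated firm
gen byte eligible = (bigchangetag == 1) | (ever_treated == 0)

list, sepby(nfid)

* Hypothetical use inside the yearly loop:
*   psmatch2 bigchangetag <covariates> if g == `j' & eligible == 1, ...
```

Note the design choice: this also removes not-yet-treated firms (firm 3 in 1999 and 2000) from the control pool, which is stricter than excluding only previously treated firms.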

Last edited by Zhang_Lu; 22 May 2016, 21:02.
Sebastian Geiger

Join Date: Oct 2015

Posts: 124
#6

23 May 2016, 07:51

I don't know the exact structure of your dataset, so I'm a little hesitant to provide concrete code for this idea. Nevertheless, I will try, but you should check it for mistakes. One way to implement the idea would be to modify, within the -foreach- loop, the dummy variable that shows whether the individual has been treated before (!). Maybe something like:

Code:

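gen treated_before = 0
foreach j of local gr {
    cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j' & treated_before!=1, n(1) logit qui common noreplacement
    * Flag rows treated in this or an earlier year.
    * As written, only the treated firm-year rows themselves are flagged; to keep
    * the firm out of later years, the flag must be copied to all rows of the firm.
    replace treated_before = 1 if g<=`j' & bigchangetag==1
}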

Last edited by Sebastian Geiger; 23 May 2016, 07:55.

Zhang_Lu

Join Date: Oct 2014
Posts: 155

24 May 2016, 07:22

Though I did not follow your example exactly, I was inspired by the approach and have largely achieved my goal. Thank you @Sebastian

Code:

gen treated_before = 0
label var matched "matched pairs by psmatch2"
levels g, local(gr)
foreach j of local gr {
    cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j' & treated_before==0, n(1) logit qui common noreplacement

    ** Collect the treated year; a treated firm can only be matched ONCE
    by nfid (treatment), sort: gen temp = (_treated==1)
    by nfid: egen num_treated = total(temp)
    // Update the treated status, changing only units that have not been treated before:
    // this "freezes" a panel unit once it has been treated in one period
    replace treated_before = num_treated if treated_before==0

    // Drop both helpers; otherwise -egen- fails on the next pass because num_treated already exists
    drop temp num_treated
}


shahla ebrahimi

Join Date: Apr 2015

Posts: 30
#8

17 Sep 2018, 00:37

Zhang_Lu, I would appreciate it if you could provide an example of how to do PSM with panel data and then use DID.

Thanks in advance.
