  • PSM with panel data using --psmatch2--, a problem.

    Hi, my current project uses a "PSM+DID" empirical design, and my dataset is longitudinal. The panel structure helps with identification but complicates data management, in particular how to implement PSM on panel data correctly. My template is Heyman et al. (JIE, 2007), who run a year-by-year PSM on "whether a firm is foreign acquired". I rely on the popular user-written command --psmatch2--. The following is a snippet of my code:
    Code:
      * Group by year
      egen g = group(year)
      levelsof g, local(gr)
      * Note: in each iteration, psmatch2 overwrites its _variables (_treated, _weight),
      * so they have to be saved in NEW variables
       foreach j of local gr {
         cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j', n(1) logit qui common noreplacement
         ** Collect the treated year
         by nfid (treatment),sort: gen temp = (_treated==1)
         by nfid: egen num_treated = total(temp)
         by nfid (temp)     ,sort: replace treat_year1 = year[_N] if treat_year1==.&num_treated==1
         drop temp num_treated
         ** Collect the (untreated) match year
         by nfid (treatment),sort: gen temp = (_treated==0)
         by nfid: egen num_treated = total(temp)
         by nfid (temp)     ,sort: replace treat_year2 = year[_N] if treat_year2==.&num_treated==1
         drop temp num_treated
    
        
         replace treatment     =_treated  if treatment==.
         replace pairs         =_id       if pairs==.
         replace matched       =_weight  if matched==.
         tab year _treated if matched==1
     }
    What I want is to obtain _treated (indicating treatment and control group) and _weight (indicating whether the observation is used in a match), plus the year in which the treatment happened. The tricky issue is that psmatch2 "refreshes" these _variables (_treated, _weight) in each iteration, so they have to be saved in NEW variables, which is exactly what I did.
    What worries me is the result. After the code ran, it gave me a series of treatment variables, namely treatment_1 - treatment_6. Since I specified the noreplacement option, I expected each panel unit (in my case, nfid) to be used only once, occasionally twice. However, the generated matched sample looks like this:

    Code:
     nfid    year  treatment  pairs(_id) treat_year
    161     2000    0        1036589    1999
    161     2001    0        1029618    1999
    161     2002    0        1050054    1999
    161     2003    .        1010596    1999
    164     1998    .        1695000    1999
    164     1999    0          80890    1999
    164     2000    0         879366    1999
    164     2001    0         781947    1999
    164     2002    0         785361    1999
    164     2003    .         957154    1999
    169     2003    .        1681113    2004
    169     2004    0        1053593    2004
    171     1998    .        1548697    2000
    171     1999    0         102531    2000
    171     2000    1         952717    2000
    171     2001    0         889980    2000
    171     2002    0         848699    2000
    171     2003    .         882134    2000
    171     2004    0         552188    2000
    173     1998    .        1674613    1999
    173     1999    0          79626    1999
    176     1998    .        1995491    1999
    176     1999    0          40486    1999
    176     2000    0         405328    1999
    179     1998    .        1963984    1999
    179     1999    0          55616    1999
    179     2000    0         515677    1999
    179     2001    0         622819    1999
    179     2002    0         610122    1999
    179     2003    .         597080    1999
    179     2004    0         446528    1999
    188     1998    .        1900139    1999
    188     1999    1         117730    1999
    188     2000    0         916699    1999
    188     2001    0        1148184    1999
    188     2002    0        1174944    1999
    188     2003    .        1321465    1999
    193     1998    .        1544959    1999
    193     1999    1         117069    1999
    193     2000    0         669927    1999
    193     2001    0         907363    1999
    193     2002    0         842696    1999
    193     2003    .         730017    1999
    These results worry me because many units are used as matched controls in EVERY year, which should not happen. I expected to obtain something like:
    Code:
    nfid    year  treatment  pairs(_id) treat_year
    
    188     1998    .        1900139    1999
    188     1999    1         117730    1999
    188     2000    .         916699    1999
    188     2001    .        1148184    1999
    188     2002    .        1174944    1999
    I don't know whether something is wrong in my code. Please help me check it and figure out what is going on. Thank you.
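
    The expected pattern can also be written as a check on the recorded variables. The sketch below is only an illustration: it uses made-up rows and assumes that treatment holds the saved _treated values (1 = treated, 0 = control, missing = not used). The helper names n_treated and n_control are hypothetical.

```stata
* Toy rows in the expected layout (values are illustrative only)
clear
input nfid year treatment
188 1998 .
188 1999 1
188 2000 .
193 1999 0
193 2000 .
end

* Count, per firm, the years recorded as treated and as control
bysort nfid: egen n_treated = total(treatment == 1)
by nfid: egen n_control = total(treatment == 0)

* A firm should be treated at most once ...
assert n_treated <= 1
* ... and a treated firm should never also appear as a control
assert !(n_treated > 0 & n_control > 0)
```

    Under the same assumption, the second assertion would fail on the matched sample listed above, for example for nfid 171, which is recorded as treated in 2000 and as a control in other years.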

  • #2
    I think I need to state my point more clearly: how can I make --psmatch2-- use a treated unit ONLY ONCE? I mean, if a panel unit is treated in one period, then it SHALL NOT be selected as a control for units treated in other periods. I believe this is more reasonable for PSM with panel data (am I right?). How can I achieve it?
    Last edited by Zhang_Lu; 21 May 2016, 20:56.

    • #3
      This may oversimplify things, but the first idea I had was to identify the previously treated observations with an additional dummy, and then just use the -if- option of psmatch2 like "if prev_treated!=1". In this case, previously treated individuals are excluded from the sample, and thus will not be used as control observation.
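
      A minimal sketch of how such a dummy might be built, assuming one row per firm-year and the variable names from the first post (nfid, year, bigchangetag). The toy data and the helper variable cum_treat are made up for illustration.

```stata
* Toy panel: firm 1 is treated in 2000, firm 2 is never treated
clear
input nfid year bigchangetag
1 1999 0
1 2000 1
1 2001 0
2 1999 0
2 2000 0
2 2001 0
end

* Running count of treatments within each firm, in year order
bysort nfid (year): gen cum_treat = sum(bigchangetag)

* A firm-year is "previously treated" if the count was already positive
* in the preceding row; the first row of a firm has no preceding row
by nfid: gen byte prev_treated = (cum_treat[_n-1] > 0) & (_n > 1)

list, sepby(nfid)

* The flag could then enter the matching call through the -if- qualifier, e.g.
*   psmatch2 bigchangetag ... if year == 2001 & prev_treated != 1, ...
```

      Here only the 2001 row of firm 1 is flagged, so the firm still counts as treated in 2000 but is no longer available afterwards.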

      For the effects of restrictions on the sample you may consider reading this paper: http://doku.iab.de/discussionpapers/2008/dp1208.pdf

      • #4
        Originally posted by Sebastian Geiger
        This may oversimplify things, but the first idea I had was to identify the previously treated observations with an additional dummy, and then just use the -if- option of psmatch2 like "if prev_treated!=1". In this case, previously treated individuals are excluded from the sample, and thus will not be used as control observation.

        For the effects of restrictions on the sample you may consider reading this paper: http://doku.iab.de/discussionpapers/2008/dp1208.pdf
        Thanks, the suggestion and the literature you provided are valuable.

        • #5
          Originally posted by Sebastian Geiger
          This may oversimplify things, but the first idea I had was to identify the previously treated observations with an additional dummy, and then just use the -if- option of psmatch2 like "if prev_treated!=1". In this case, previously treated individuals are excluded from the sample, and thus will not be used as control observation.

          For the effects of restrictions on the sample you may consider reading this paper: http://doku.iab.de/discussionpapers/2008/dp1208.pdf
          The problem is how to implement this idea with --psmatch2--. Say I define a dummy indicating whether a firm has ever been treated during the whole observation period, and adjust my code like this:
          Code:
          * Group by year
          egen g = group(year)
          levelsof g, local(gr)
          * Note: in each iteration, psmatch2 overwrites its _variables (_treated, _weight),
          * so they have to be saved in NEW variables
          foreach j of local gr {
              cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j' & dummy~=., n(1) logit qui common noreplacement
          }
          Then psmatch2 will not choose firms with at least one treatment, either as treated or as control units. But I actually want them to be eligible for the treatment group, just not for the control group, so you can see the difference. I still have not figured out how to select the treatment and control groups separately, if that is even possible with --psmatch2--.
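
          One possible way to express this asymmetry, offered as a sketch rather than a tested recipe: keep the treated firm-years plus all years of never-treated firms, and pass that flag to the -if- qualifier. The toy data and the names ever_treated and eligible are hypothetical; nfid, year and bigchangetag follow the code above.

```stata
* Toy panel: firm 1 is treated in 2000, firm 2 is never treated
clear
input nfid year bigchangetag
1 1999 0
1 2000 1
1 2001 0
2 1999 0
2 2000 0
2 2001 0
end

* 1 if the firm is treated in any year of the panel
bysort nfid: egen byte ever_treated = max(bigchangetag)

* Treated firm-years stay in the sample as candidate treated units;
* untreated rows stay only if the firm is never treated, so a firm that is
* treated at some point cannot serve as a control in another year
gen byte eligible = (bigchangetag == 1) | (ever_treated == 0)

list, sepby(nfid)

* Hypothetical use inside the year loop:
*   psmatch2 bigchangetag ... if g == `j' & eligible == 1, ...
```

          Note that this rule is stricter than excluding only previously treated firms: it also removes firms from the control pool in the years before their own treatment.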
          Last edited by Zhang_Lu; 22 May 2016, 21:02.

          • #6
            I don't know the exact structure of your dataset, so I'm a little hesitant to give you concrete code for this idea. Nevertheless, I will try, but you should check it for mistakes. One way would be to modify, within the -foreach- loop, the dummy variable that shows whether the individual has been treated before (!). Maybe something like:


            Code:
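            gen byte treated_before = 0

            foreach j of local gr {
                cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j' & treated_before!=1, n(1) logit qui common noreplacement

                * Flag ALL rows of a firm (nfid) treated in the current year or earlier;
                * flagging only the treated rows would not exclude the firm's later years
                bysort nfid: egen byte temp = max(g<=`j' & bigchangetag==1)
                replace treated_before = 1 if temp==1
                drop temp
            }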
            Last edited by Sebastian Geiger; 23 May 2016, 07:55.

            • #7
              Though I did not follow your example exactly, I was inspired by the approach and have largely achieved my goal. Thank you @Sebastian
              Code:
              gen treated_before = 0
              label var matched "matched pairs by psmatch2"
              levelsof g, local(gr)
              foreach j of local gr {
                  cap noi psmatch2 bigchangetag $x $high_order $xv if g==`j' & treated_before==0, n(1) logit qui common noreplacement

                  ** Collect the treated year; a treated firm can only be matched ONCE
                  by nfid (treatment), sort: gen temp = (_treated==1)
                  by nfid: egen num_treated = total(temp)
                  * Update the treated status only for firms not treated before,
                  * which "freezes" a panel unit once it has been treated in one period
                  replace treated_before = num_treated if treated_before==0

                  * Drop both helpers so the next iteration can recreate them
                  drop temp num_treated
              }

              • #8
                Zhang_Lu, I would appreciate it if you could provide an example of how to do PSM with panel data and then use DID.

                Thanks in advance.
