  • Causal inference for a policy analysis using Difference in Difference

    I am working on analyzing whether a policy implemented in 2011 impacted the companies that adopted it. Below are the details:

    Dataset:
    • The dependent variable Y is measured from 2006 to 2016
    • Treatment Year: 2011
    • Number of companies that implemented the policy: 33
    • Number of companies that did not implement the policy: 335
    The goal is twofold:
    • Does the policy have an impact on Y?
    • For the companies that implemented the policy, what percentage of the change in Y from 2006 to 2016 was due to the policy?
    What I have done so far:

    I have created three variables:
    • Treatment variable ('treatment'): 0 assigned to companies in the control group and 1 assigned to companies in the treatment group
    • Time Variable ('time'): 0 for years 2006 to 2010 and 1 for years from 2011 to 2016
    • Treatment*Time variable ('treatment_time'): 1 for companies in the treatment group in years 2011 onward, and 0 otherwise
    In addition, I have three more variables:
    • companyID - one for each company in the dataset
    • year - I have recoded it such that 2006 is 1, 2007 is 2, ....., and 2016 is 11.
    • y - dependent variable
    Models I have built:

    I am trying to run the following code, but I am not sure if I am doing it correctly:
    • xtreg y time##treatment, r
    • reg incrate post_t##treatment, r
    • reg y time treatment treatment_time, r
    • xtdidregress (y) (treatment_time), group(companyid) time(year)
    Can you please guide me on the best way to address the problem? Thank you for the help.

  • #2
    While this is actually a well-described dataset, a written description of the data is not enough by itself: we need example data and code. Please see the FAQ for more.

    I guess my question for you is: what's the problem here? Is Stata doing something you didn't expect? Do you want it to do something it isn't doing? Otherwise, I have only one more thing to say: welcome to Statalist!

    • #3
      Hi Jared. Thank you for the warm welcome and for the guidance on how to ask specific questions on this forum. I am new to Stata and to the difference-in-differences method. I am unsure whether I am implementing the model correctly and how to interpret the results. Below are the code and output for the two models I have run:

      Model 1: Using 'xtreg'

      Code:
      . xtreg y time##treatment,re
      
      Random-effects GLS regression                   Number of obs     =      3,718
      Group variable: companyid                       Number of groups  =        338
      
      R-squared:                                      Obs per group:
           Within  = 0.0119                                         min =         11
           Between = 0.0221                                         avg =       11.0
           Overall = 0.0162                                         max =         11
      
                                                      Wald chi2(3)      =      48.20
      corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
      
      --------------------------------------------------------------------------------
                   y | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      ---------------+----------------------------------------------------------------
              1.time |  -.7162854   .1162033    -6.16   0.000    -.9440396   -.4885312
         1.treatment |  -1.466409   .5456852    -2.69   0.007    -2.535933    -.396886
                     |
      time#treatment |
                1 1  |      .1354   .3776606     0.36   0.720    -.6048011    .8756011
                     |
               _cons |   4.005784   .1679031    23.86   0.000       3.6767    4.334868
      ---------------+----------------------------------------------------------------
             sigma_u |  2.5244368
             sigma_e |  3.3569377
                 rho |  .36123166   (fraction of variance due to u_i)
      --------------------------------------------------------------------------------
      Model 2: Using 'xtdidregress'

      Code:
       xtdidregress (y) (treatment_time), group(companyid) time(year1)
      
      Number of groups and treatment time
      
      Time variable: year1
      Control:       treatment_time = 0
      Treatment:     treatment_time = 1
      -----------------------------------
                   |   Control  Treatment
      -------------+---------------------
      Group        |
            mineid |       306         32
      -------------+---------------------
      Time         |
           Minimum |         4          9
           Maximum |         4          9
      -----------------------------------
      
      Difference-in-differences regression                     Number of obs = 3,718
      Data type: Longitudinal
      
                                       (Std. err. adjusted for 338 clusters in mineid)
      --------------------------------------------------------------------------------
                     |               Robust
                   y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ---------------+----------------------------------------------------------------
      ATET           |
      treatment_time |
           (1 vs 0)  |      .1354   .2554092     0.53   0.596     -.366997     .637797
      --------------------------------------------------------------------------------
      Note: ATET estimate adjusted for panel effects and time effects.

      For Model 2, I also ran the following tests:

      Code:
      . estat ptrends
      
      Parallel-trends test (pretreatment time period)
      H0: Linear trends are parallel
      
      F(1, 337) =   0.13
       Prob > F = 0.7142
      
      . estat granger
      
      Granger causality test
      H0: No effect in anticipation of treatment
      
      F(4, 337) =   1.64
       Prob > F = 0.1646

      Here is the output of estat trendplot (attachment: Graph.gph).


      I need help interpreting these results. I also wonder whether there is a better method to answer the question. Thank you.
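
      As a cross-check on the two models above, here is a minimal sketch that simulates a panel with the same layout (338 companies, 11 years, 32 treated from 2011 onward) and fits a two-way fixed-effects regression next to xtdidregress. The simulated values, including the 0.5 policy effect, are invented for illustration and are not the data from this thread; only the variable names follow the post. On a balanced panel like this one, the two commands should report the same point estimate for treatment_time.

      ```stata
      * Simulated panel: every number below is an illustrative assumption
      clear
      set seed 12345
      set obs 338
      gen companyid = _n
      * First 32 companies adopt the policy
      gen treatment = companyid <= 32
      * Company-level effect, constant over time
      gen u = rnormal()
      * Eleven yearly rows per company: 2006, ..., 2016
      expand 11
      bysort companyid: gen year = 2005 + _n
      * Post-policy indicator and the DiD interaction
      gen time = year >= 2011
      gen treatment_time = treatment * time
      gen y = 4 + u - 0.7*time + 0.5*treatment_time + rnormal()

      xtset companyid year
      * Company and year fixed effects absorb 'treatment' and 'time'
      xtreg y i.treatment_time i.year, fe vce(cluster companyid)
      * Built-in DiD command (Stata 17 or later)
      xtdidregress (y) (treatment_time), group(companyid) time(year)
      ```

      In both outputs the coefficient on treatment_time is the difference-in-differences estimate. In the results posted above, both models give .1354 with a 95% confidence interval that includes zero, and neither estat ptrends (p = 0.7142) nor estat granger (p = 0.1646) rejects its null hypothesis.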

      • #4
        The better method I would recommend here is sdid. You can get it from SSC. People oftentimes recommend scul (which I wrote) for instances like this, but it isn't a good idea here because scul needs a moderate pre-intervention period, and you don't have that.

        There's a new paper describing sdid for Stata, and I feel it offers a more rigorous test than a standard DD.
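
        A minimal sketch of what an sdid call might look like, run on a small simulated balanced panel so that it is self-contained. The simulated numbers, the bootstrap choice, the 100 replications, and the seed are all arbitrary assumptions for illustration; check help sdid after installing for the exact syntax of your version. The four positional arguments are the outcome, the unit identifier, the time variable, and a treatment indicator that equals 1 only for treated units in post-treatment periods (treatment_time in this thread).

        ```stata
        * Install once from SSC
        ssc install sdid, replace

        * Small simulated balanced panel: illustrative values only
        clear
        set seed 2024
        set obs 60
        gen companyid = _n
        gen treatment = companyid <= 8
        gen u = rnormal()
        expand 11
        bysort companyid: gen year = 2005 + _n
        gen treatment_time = treatment * (year >= 2011)
        gen y = 4 + u + 0.5*treatment_time + rnormal()

        * Arguments: outcome, unit id, time variable, treatment indicator
        sdid y companyid year treatment_time, vce(bootstrap) reps(100) seed(1234)
        ```

        With the real data, only the last line is needed once the panel is in memory; sdid expects a balanced panel.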

        • #5
          Thank you, Jared!
