Causal inference for a policy analysis using Difference in Difference

Sid Agrawal

Join Date: Feb 2023

Posts: 3
#1


04 Feb 2023, 10:47

I am working on analyzing whether a policy implemented in 2011 impacted the companies that adopted it. Below are the details:

Dataset:
The dependent variable Y is measured from 2006 to 2016

Treatment Year: 2011

#Companies that implemented the policy = 33

#Companies that did not implement the policy = 335

The goal is twofold:
Does the policy have an impact on Y?

For the companies that implemented the policy: what percentage of the change in the dependent variable from 2006 to 2016 was due to the policy?
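One way to make the second question concrete, offered only as a rough sketch: divide the estimated treatment effect by the raw 2006-to-2016 change in the treated group's mean of y. The variable names (y, treatment, treatment_time, year, companyid) are the ones described in this post; the scalar names are made up for illustration. The estimated effect is an average over 2011 to 2016, so the ratio is only an approximate reading of "share due to the policy", and it is uninformative if the effect is not distinguishable from zero.

```stata
* Sketch only: scalar names (att, y2006, y2016, share) are hypothetical
xtset companyid year

* Two-way fixed-effects DiD: company effects via fe, year effects via i.year
quietly xtreg y treatment_time i.year, fe vce(cluster companyid)
scalar att = _b[treatment_time]

* Mean of y for treated companies in the first and last year (year 1 = 2006, 11 = 2016)
quietly summarize y if treatment == 1 & year == 1, meanonly
scalar y2006 = r(mean)
quietly summarize y if treatment == 1 & year == 11, meanonly
scalar y2016 = r(mean)

* Estimated effect as a percentage of the observed change in the treated group
scalar share = 100 * scalar(att) / (scalar(y2016) - scalar(y2006))
display "Percent of 2006-2016 change attributed to the policy: " %6.1f scalar(share)
```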

What I have done so far:

I have created three variables:
Treatment variable ('treatment'): 0 assigned to companies in the control group and 1 assigned to companies in the treatment group

Time Variable ('time'): 0 for years 2006 to 2010 and 1 for years from 2011 to 2016

Treatment*Time variable ('treatment_time'): 1 for companies in the treatment group in years 2011 onward (treatment = 1 and time = 1), and 0 otherwise

In addition to it, I have 3 more variables:
companyID - one for each company in the dataset

year - I have recoded it such that 2006 is 1, 2007 is 2, ..., and 2016 is 11.

y - dependent variable
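For reference, the variables above could be built along these lines. This is a sketch under assumptions: calyear is a hypothetical name for the original calendar-year variable (2006 to 2016), and treatment is taken to exist already as a 0/1 indicator.

```stata
* Sketch only: calyear is a hypothetical name for the unrecoded calendar year
* Post-period indicator: 0 for 2006-2010, 1 for 2011-2016
generate byte time = (calyear >= 2011) if !missing(calyear)

* DiD interaction: 1 only for treated companies in the post period
generate byte treatment_time = treatment * time

* Recoded year: 2006 -> 1, ..., 2016 -> 11 (skip this line if year already exists)
generate byte year = calyear - 2005

* Declare the panel structure before using xt commands
xtset companyid year
```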

Models I have built:

I am trying to run the following code, but I am not sure if I am doing it correctly:
xtreg y time##treatment, r

reg incrate post_t##treatment, r

reg y time treatment treatment_time, r

xtdidregress (y) (treatment_time), group(companyid) time(year)

Can you please guide me on the best way to address the problem? Thank you for the help.
Tags: difference-in-difference, panel data
Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#2

04 Feb 2023, 13:57

While this is actually a well-described dataset, we do not want written descriptions of data by themselves. We need data and code; please see the FAQ for more.

I guess my question for you is: what's the problem here? Is Stata doing something you didn't expect? Do you want it to do something it isn't doing? Otherwise, I only have one more thing to say: welcome to Statalist!

Sid Agrawal

Join Date: Feb 2023
Posts: 3

04 Feb 2023, 14:57

Hi Jared. Thank you for the warm welcome and for guiding me on how to be specific when asking questions on this forum. I am new to Stata and to the difference-in-differences method. I am unsure whether I am implementing the model correctly and how to interpret the results. Below are the code and output for the two models I have run:

Model 1: Using 'xtreg'

Code:

. xtreg y time##treatment,re

Random-effects GLS regression                   Number of obs     =      3,718
Group variable: companyid                          Number of groups  =        338

R-squared:                                      Obs per group:
     Within  = 0.0119                                         min =         11
     Between = 0.0221                                         avg =       11.0
     Overall = 0.0162                                         max =         11

                                                Wald chi2(3)      =      48.20
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

----------------------------------------------------------------------------------
         y | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-----------------+----------------------------------------------------------------
        1.time |  -.7162854   .1162033    -6.16   0.000    -.9440396   -.4885312
     1.treatment |  -1.466409   .5456852    -2.69   0.007    -2.535933    -.396886
                 |
time#treatment |
            1 1  |      .1354   .3776606     0.36   0.720    -.6048011    .8756011
                 |
           _cons |   4.005784   .1679031    23.86   0.000       3.6767    4.334868
-----------------+----------------------------------------------------------------
         sigma_u |  2.5244368
         sigma_e |  3.3569377
             rho |  .36123166   (fraction of variance due to u_i)
----------------------------------------------------------------------------------

Model 2: Using 'xtdidregress'

Code:

 xtdidregress (y) (treatment_time), group(companyid) time(year1)

Number of groups and treatment time

Time variable: year1
Control:       treatment_time = 0
Treatment:     treatment_time = 1
-----------------------------------
             |   Control  Treatment
-------------+---------------------
Group        |
      mineid |       306         32
-------------+---------------------
Time         |
     Minimum |         4          9
     Maximum |         4          9
-----------------------------------

Difference-in-differences regression                     Number of obs = 3,718
Data type: Longitudinal

                                   (Std. err. adjusted for 338 clusters in mineid)
----------------------------------------------------------------------------------
                 |               Robust
         y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-----------------+----------------------------------------------------------------
ATET             |
treatment_time|
       (1 vs 0)  |      .1354   .2554092     0.53   0.596     -.366997     .637797
----------------------------------------------------------------------------------
Note: ATET estimate adjusted for panel effects and time effects.
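As a cross-check, and only as a sketch assuming the same variables as in the command above: the note says the ATET is adjusted for panel and time effects, so the point estimate should coincide with the coefficient on treatment_time from a two-way fixed-effects regression with company and year effects. The standard errors may differ slightly because of degrees-of-freedom adjustments.

```stata
* Sketch only: reproduces the DiD point estimate with a two-way fixed-effects model
xtset companyid year1

* fe absorbs company effects; i.year1 adds a dummy for each year
* vce(cluster companyid) clusters the standard errors by company
xtreg y treatment_time i.year1, fe vce(cluster companyid)

* The coefficient of interest is the one on treatment_time
display "DiD estimate: " _b[treatment_time]
```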

For Model 2, I also ran the following tests:

Code:

. estat ptrends

Parallel-trends test (pretreatment time period)
H0: Linear trends are parallel

F(1, 337) =   0.13
 Prob > F = 0.7142

. estat granger

Granger causality test
H0: No effect in anticipation of treatment

F(4, 337) =   1.64
 Prob > F = 0.1646

Here is the output of

Code:

estat trendplot

[Attached graph: Graph.gph]

I need help interpreting these results. I also wonder whether there is a better method to answer the question. Thank you.


Jared Greathouse

Join Date: Sep 2021

Posts: 2172
#4

05 Feb 2023, 09:22

The method I would recommend here is sdid (synthetic difference-in-differences). You can get it from SSC. People oftentimes recommend scul (which I wrote) for instances like this, but it isn't a good idea here because scul needs a moderately long pre-intervention period, and you don't have that.

There's a new paper describing sdid for Stata, and I feel it offers a more rigorous test than normal DD.
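A minimal sketch of what that could look like, assuming the variable names from the posts above (y, companyid, year, treatment_time) and a strongly balanced panel, which sdid requires. The seed value is arbitrary, and bootstrap is just one of the inference options; check help sdid after installing for the full syntax.

```stata
* Sketch only: install the community-contributed sdid command from SSC
ssc install sdid, replace

* Syntax: sdid depvar groupvar timevar treatmentvar, vce(type)
* treatment_time must be 1 for treated companies in 2011 onward and 0 otherwise
sdid y companyid year treatment_time, vce(bootstrap) seed(1213)
```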
Sid Agrawal

Join Date: Feb 2023

Posts: 3
#5

06 Feb 2023, 19:36

Thank you, Jared!