Triple Differences-in-Differences with two-way clustered standard errors

Andrew Dickens

Join Date: May 2014

Posts: 9
#1


07 Mar 2015, 11:42

Hi all,

I am running Stata 12 and am curious what the most efficient method is for estimating a triple diff-in-diff model with Cameron, Gelbach and Miller standard errors. I am currently using -cgmreg- combined with -xi- to estimate my model, but this takes about 2 hours for each permutation of my regression because I am explicitly estimating a few thousand dummy variables. I'm not aware of a way to use -areg- or -xtreg- because I require two-way clustered standard errors.

For example, imagine a dependent variable (y) and the treatment variable (x) that both vary by three dimensions: country, group and year. I would estimate this model using the following:

Code:

xi: cgmreg y x country##group country##year group##year, cluster(group country)

The only estimate I care about retrieving is the treatment variable x. I would appreciate thoughts about a better way to estimate this model if possible. Thank you.
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#2

08 Mar 2015, 06:22

Andrew: welcome to the Forum, but please re-read the FAQ. Note in particular the injunction to provide complete bibliographic references, and also the source of user-written programs. This is a multi-disciplinary list. Also, the xi: prefix is outmoded; can factor variables not be used here? help fvvarlist
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#3

08 Mar 2015, 06:27

If all you care about is the coefficient on x, then you should just partial out the country/year dummies and interactions. You would do this by something like

Code:

reg y country##group country##year group##year
predict double yresid, resid
reg x country##group country##year group##year
predict double xresid, resid
cgmreg yresid xresid, nocons cluster(group country)

I don't know offhand what cgmreg does about small-sample dof adjustments. If it makes some kind of adjustment, then you'll need to adjust the SEs by hand to mimic since cgmreg won't know about the partialled-out variables.

BTW, ivreg2 and xtivreg2 support 2-way clustered SEs as well as a partialling-out option. We will shortly be releasing a version of ivreg2 that supports factor variables (finally!) that would mean you can do all of the above in one line, but for now you can do it by hand as above.
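
The partialling-out recipe above relies on the Frisch-Waugh-Lovell result: the coefficient on x from the residual-on-residual regression equals the coefficient from the full regression. A minimal sketch of that equivalence, using Stata's shipped auto data and plain -regress- as stand-ins (the variables and fixed effects here are illustrative assumptions, not the model from the original post):

```stata
* Illustration only: auto.dta stands in for the real panel
sysuse auto, clear
drop if missing(rep78)

* Full regression with the dummies included explicitly
regress price mpg i.rep78 i.foreign
scalar b_full = _b[mpg]

* Partial the dummies out of y and x, then regress residual on residual
regress price i.rep78 i.foreign
predict double yres, residuals
regress mpg i.rep78 i.foreign
predict double xres, residuals
regress yres xres, noconstant
scalar b_fwl = _b[xres]

* The two point estimates agree up to rounding error
display b_full "  " b_fwl
assert reldif(b_full, b_fwl) < 1e-8
```

Only the point estimate carries over automatically; the reported standard errors can still differ by a degrees-of-freedom factor, because the second-stage command does not know how many variables were partialled out.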
Sergio Correia

Join Date: Apr 2014

Posts: 420
#4

08 Mar 2015, 16:26

Hi Andrew,

Two comments:

i) I believe -cgmreg- does not take into account when the regressors are collinear. When Stata drops regressors due to collinearity (which probably happens *a lot* with all your FEs), you shouldn't use the dropped regressors to calculate the degrees of freedom. However, I recall -cgmreg- ignores that (which is usually not an issue since most people only have a few variables so N-K is close to N).

ii) What version of Stata are you using?

Best,
Sergio
Andrew Dickens

Join Date: May 2014

Posts: 9
#5

09 Mar 2015, 13:46

Thanks for the comments and suggestions.

Stephen, apologies for not following protocol. In the future I will be sure to review the FAQ before posting.

Sergio, thanks for the comment, I will look into how cgmreg handles collinear regressors. To answer your question I am using Stata 12.

Mark, that's great to hear ivreg2 will soon support factor variables. In the meantime, I have tried following your suggestion, but I hoped you could elaborate on your point about manually adjusting SEs in Stata. I've been looking through different Stata manuals and I believe this can be done by taking e(V) from ereturn and scaling it by the appropriate DOF adjustment? Is this how manual adjustments are typically made to SEs in Stata?
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#6

09 Mar 2015, 18:43

Andrew - it depends on what cgmreg does and whether you want to mimic it. It's possible that it makes no small-sample adjustment relating to the number of regressors (this is what ivreg2 would do unless you specify the small option). If this is OK for you, then you can just partial out and use the cgmreg results directly. If you want to make a small sample adjustment to e(V), then it's up to you.
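
One way such a manual adjustment could look, sketched with official -regress- and one-way clustering, whose small-sample factor is (G/(G-1)) x ((N-1)/(N-k)). Whether -cgmreg- uses the same factor is not established here, so the rescaling below is an assumption to check against its source code; the auto data and variable choices are purely illustrative.

```stata
sysuse auto, clear
drop if missing(rep78)

* Benchmark: dummy included explicitly, so k counts mpg, 1.foreign and _cons
regress price mpg i.foreign, vce(cluster rep78)
scalar se_full = _se[mpg]
scalar k_full  = 3
scalar N       = e(N)

* Partialled-out version: the command only sees one regressor (k = 1)
regress price i.foreign
predict double yres, residuals
regress mpg i.foreign
predict double xres, residuals
regress yres xres, noconstant vce(cluster rep78)
scalar se_part = _se[xres]
scalar k_part  = 1

* Rescale so the (N-1)/(N-k) factor uses the full parameter count
scalar se_adj = se_part * sqrt((N - k_part) / (N - k_full))
display "full: " se_full "   partialled-out: " se_part "   rescaled: " se_adj
```

If the assumed factor is the right one, the rescaled SE should line up with the benchmark. The same idea applies to e(V) as a whole: multiply the variance by (N - k_part)/(N - k_full) rather than the SE by its square root.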
Andrew Dickens

Join Date: May 2014

Posts: 9
#7

09 Mar 2015, 19:36

Mark, thanks again for the comment. I tried your suggestion: I partialled out the factor variables and then used the cgmreg results directly. Compared with my original approach, the SE is considerably smaller when partialling out without any further adjustment, which to me suggests an adjustment is necessary. I also read that cgmreg uses a similar (but actually slightly different) adjustment to the small option available with ivreg2, which, if I am understanding this correctly, confirms that an adjustment is necessary before I can trust my results. Having never manually adjusted standard errors in Stata before, I am now trying to understand how to do so, and I would greatly appreciate any insight you have.
Sergio Correia

Join Date: Apr 2014

Posts: 420
#8

09 Mar 2015, 21:38

Hi Andrew,

You could try the development version of -reghdfe-, which does multi-way clustering with the correct DoF adjustment (although the help files haven't been updated). In essence, -reghdfe- is basically -areg- on steroids. It allows any number of fixed effects in absorb() and of cluster variables through vce(cluster ...).

Since you don't have Stata 13 (the only version that supports installing from https), you would need to extract the files within this zipfile to a folder:

https://github.com/sergiocorreia/reg...e.zip?raw=true
Then run "net from FolderPath", and then install the package from there.

To run your regression, you would need to do:
reghdfe y x , absorb(country#group country#year group#year) vce(cluster group country)

(Notice that I replaced your country##group variables with country#group, which still adds a fixed effect for every combination of (country, group).)
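
To see the difference between the two operators, -fvexpand- lists the terms each one generates. A small sketch on the auto data (foreign and rep78 are stand-ins for country and group, not variables from the original model):

```stata
sysuse auto, clear

* ## expands to both main effects plus the interaction
fvexpand i.foreign##i.rep78
display "`r(varlist)'"

* # expands to the interaction cells only
fvexpand i.foreign#i.rep78
display "`r(varlist)'"
```

Inside absorb() the main effects are redundant, since the full set of interaction cells already spans them.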

Let me know how it goes if you use it.
Best,
Sergio
Andrew Dickens

Join Date: May 2014

Posts: 9
#9

10 Mar 2015, 06:29

Hi Sergio,

Your suggestion worked, thank you. I had my coefficient estimate within ~30 seconds versus the ~2 hour wait of my original approach! But I am curious, although I get the right coefficient estimate, the SE is still estimated slightly smaller than when I ran my original approach, i.e.,

Code:

cgmreg y x country##group country##year group##year, cluster(group country)

Do you know why this is happening? Is this because cgmreg does not take into account dropped collinear FE regressors when calculating the appropriate degrees of freedom adjustment?
Mark Schaffer

Join Date: Mar 2014

Posts: 324
#10

10 Mar 2015, 06:37

Andrew - it looks like cgmreg makes a small-sample adjustment that accounts for the number of regressors. Because you are including the FEs by hand, it counts these FEs as regressors along with everything else. This is probably overconservative. For example, if I am not mistaken, Stata's official xtreg + standard one-way clustering uses a small-sample adjustment that counts only the regressors in the model and not the FEs as well. You can probably not worry about doing any small-sample adjustment, esp. since you apparently have only one variable in the model (the treatment variable x).

That said, Sergio's suggestion to use reghdfe looks like the way to go - 2-way cluster-robust in a single command.
Sergio Correia

Join Date: Apr 2014

Posts: 420
#11

10 Mar 2015, 11:17

Hi Andrew,

Mark is right; -cgmreg- is being overly conservative for several reasons:

1) Like he said, -xtreg- with clusters, when computing the VCV matrix, will not take into account the fixed effects in the "k" when multiplying the VCV matrix by N/(N-k). The rationale is that "the fixed effects are nested within the cluster" (i.e. each FE is estimated using only data within the cluster). We do the same check, and indeed, all of your FEs seem nested within the clusters, so we don't penalize for that. You can see that in the last table of the output, where most FEs are marked as "redundant", and you can also run reghdfe with the dofadjustment(none) option to get results closer to cgmreg.

2) Besides the "nested in cluster" adjustment, -cgmreg- does something odd which I believe is due to changes in Stata in the past. I think in the past collinear regressors were dropped, but now they are only tagged with the o. prefix and not dropped. However, cgmreg does the following to calculate k:

Code:

mat `rows' = rowsof(`xxinv')

Thus, they will penalize even for variables that are collinear (which will happen a lot in your specification).
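
To get a feel for the size of this penalty, the N/(N-k) factor can be evaluated for two values of k. The numbers below are hypothetical, chosen only to resemble a model with a few thousand dummy columns; they are not taken from the regression in this thread.

```stata
* Hypothetical values, for illustration only
* N: number of observations
scalar N     = 100000
* k_all: x plus every dummy column, collinear ones included
scalar k_all = 3001
* k_x: x only, with the FEs not counted
scalar k_x   = 1

* Ratio of the two standard errors implied by the N/(N-k) variance factor
display sqrt((N - k_x) / (N - k_all))
```

The gap grows as the number of dummy columns approaches N, so the more saturated the specification, the more the two approaches will disagree.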

Cheers,
Sergio
Andrew Dickens

Join Date: May 2014

Posts: 9
#12

10 Mar 2015, 12:32

Mark & Sergio,

This all makes sense to me now. I really appreciate all the help and comments.

Best,

Andrew
Mitch Downey

Join Date: Oct 2014

Posts: 24
#13

05 Apr 2015, 18:55

Hi Sergio,

The availability of two-way clustering within reghdfe would be terrific (reghdfe is already a tremendous asset). Unfortunately, the syntax you posted above isn't working for me. When I run:

Code:

clear
sysuse auto
drop if rep78 == .
reghdfe price mpg gear_ratio, absorb(foreign rep78) vce(cluster foreign rep78)

I get an error: "invalid vce type or number of clusters"

I typed

Code:

adoupdate, update

and Stata tells me that reghdfe (at http://fmwww.bc.edu/repec/bocode/r) is up to date. Am I missing something? I am using Stata/MP 13.1 but should admit that I'm not great with computers, so maybe I'm doing something wrong that's actually very simple.

Thanks in advance. Also, thanks for making reghdfe available. It's very helpful.

-Mitch

Mitch Downey, Grad student, UCSD Economics

Sergio Correia

Join Date: Apr 2014
Posts: 420

#14

05 Apr 2015, 19:31

Hi Mitch,

It works on my end:

Code:

. clear

. sysuse auto
(1978 Automobile Data)

.
. drop if rep78 == .
(5 observations deleted)

.
. reghdfe price mpg gear_ratio, absorb(foreign rep78) vce(cluster foreign rep78)
Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
Warning: Some variables were dropped by the F test due to collinearity (or insufficient number of clusters).

HDFE Linear regression                            Number of obs   =         69
Absorbing 2 HDFE indicators                       F(   1,      1) =       4.70
Statistics robust to heteroskedasticity           Prob > F        =     0.2752
                                                  R-squared       =     0.3636
                                                  Adj R-squared   =     0.2906
Number of clusters (foreign) =          2         Within R-sq.    =     0.3542
Number of clusters (rep78)   =          5         Root MSE        =  2453.0682

                          (Std. Err. adjusted for 2 clusters in foreign rep78)
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -170.0455   41.53084    -4.09   0.152    -697.7449    357.6539
  gear_ratio |  -3434.423   1584.705    -2.17   0.275    -23570.01    16701.17
------------------------------------------------------------------------------

------------------------------------------------------------------------------
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |    Corr. w/xb
-------------+-------------------------------------------------+--------------
   i.foreign |            0               2              2     |       -0.6791
     i.rep78 |            0               5              5     |       -0.1573
------------------------------------------------------------------------------

Now, you only have two clusters on foreign so strange things can be going on. For instance, if you use an alternative-but-equivalent method of obtaining the VCE (that uses avar.ado instead of mwc.ado), reghdfe cannot compute standard errors:

Code:

reghdfe price mpg gear_ratio, absorb(foreign rep78) vce(cluster foreign rep78, suite(avar))

Finally, have you tried using the github version? https://github.com/sergiocorreia/reghdfe . It fixed a small bug that might make the difference in your case.
To install it, run:

Code:

cap ado uninstall reghdfe
net from https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/package/
net install reghdfe

Let me know how it goes.

Best,
Sergio

Last edited by Sergio Correia; 05 Apr 2015, 19:33.


Mitch Downey

Join Date: Oct 2014

Posts: 24
#15

05 Apr 2015, 19:53

Hi Sergio,

Thanks for the quick response. I tried the code you posted to download from github, and everything worked great (I had to install tuples, but that's all). It worked for both the toy example and my real application. Thanks for the help!

-Mitch

Mitch Downey, Grad student, UCSD Economics
