Two Way Mundlak in Pooled Model

Keegan Robertson

Join Date: Dec 2023

Posts: 12
#1

14 Dec 2023, 01:32

I'm trying to implement the two-way Mundlak (TWM) estimator from Jeff Wooldridge (2021) in a pooled OLS model, so that it reproduces the results of a TWFE model.

I use regress and reghdfe as my comparison commands. The coefficients match when I add only the unit mean (a one-way FE) but not when I also add the time mean. Where am I going wrong?

I'm not 100% sure whether regress with two-way Mundlak is meant to be equivalent to reghdfe (reghdfe and xtreg, fe generate equivalent results, but reghdfe is much quicker so I use that here), or whether it should be xtreg, re with TWM (which I also tried; it also works for one-way FE but not TWFE).

A couple of notes about my data: T=58, N=~4000. As a panel it is highly unbalanced (some units are observed in only one period); however, no data is missing. I've seen Correia's note about singletons, but I tried both ways and the difference does not affect interpretation in my dataset.

Code:

import delimited "dat.csv", delimiter(",")

* mean x by district across years (unit mean)
egen x_dist_mean = mean(x), by(dist)

* mean x across districts by year (time mean)
egen x_yr_mean = mean(x), by(yr)

*Compare District FE across various approaches - Coefficients are equivalent but errors are higher for Regress
regress y x x_dist_mean
reghdfe y x, absorb(i.dist) keepsingletons

*Compare TWFE across approaches - regress is out by quite a bit
regress y x x_dist_mean x_yr_mean
reghdfe y x, absorb(i.dist i.yr) keepsingletons
Tags: fixed effects, mundlak, regression
Keegan Robertson

Join Date: Dec 2023

Posts: 12
#2

14 Dec 2023, 01:47

I should have mentioned that the reason this is relevant is that I am trying to apply TWFE to a fracreg logit model, but there are simply too many fixed effects to add as categorical variables, so I thought I could apply the TWM instead. The results from regress and the margins of fracreg logit are equivalent, so if I can get regress with TWM to mirror the TWFE output, I would feel safer applying the TWM to the fracreg approach. If this seems wrong, let me know! Professor Wooldridge's paper does indicate that fractional models should work.

Andrew Musau

Join Date: Oct 2014
Posts: 10168
#3

14 Dec 2023, 01:53

You should get the same coefficient in a balanced panel. See below with the Grunfeld dataset.

Code:

webuse grunfeld, clear
rename (invest mvalue company year) (y x dist yr)
* mean x by district across years (unit mean)
egen x_dist_mean = mean(x), by(dist)

* mean x across districts by year (time mean)
egen x_yr_mean = mean(x), by(yr)

*Compare District FE across various approaches - Coefficients are equivalent but errors are higher for Regress
regress y x x_dist_mean
reghdfe y x, absorb(i.dist) keepsingletons

*Compare TWFE across approaches - regress is out by quite a bit
regress y x  x_dist_mean x_yr_mean
reghdfe y x, absorb(i.dist i.yr) keepsingletons

Res.:

Code:

. *Compare TWFE across approaches - regress is out by quite a bit

. 
. regress y x  x_dist_mean x_yr_mean

      Source |       SS           df       MS      Number of obs   =       200
-------------+----------------------------------   F(3, 196)       =    186.95
       Model |  6936060.29         3   2312020.1   Prob > F        =    0.0000
    Residual |  2423883.62       196  12366.7532   R-squared       =    0.7410
-------------+----------------------------------   Adj R-squared   =    0.7371
       Total |  9359943.92       199  47034.8941   Root MSE        =    111.21

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |   .1799679   .0283179     6.36   0.000     .1241211    .2358147
 x_dist_mean |  -.0420708   .0289906    -1.45   0.148    -.0992443    .0151028
   x_yr_mean |    .029871    .049165     0.61   0.544    -.0670893    .1268314
       _cons |   -35.5134   54.17701    -0.66   0.513    -142.3581    71.33133
------------------------------------------------------------------------------

. 
. reghdfe y x, absorb(i.dist i.yr) keepsingletons
WARNING: Singleton observations not dropped; statistical significance is biased (link)
(MWFE estimator converged in 2 iterations)

HDFE Linear regression                            Number of obs   =        200
Absorbing 2 HDFE groups                           F(   1,    170) =      76.08
                                                  Prob > F        =     0.0000
                                                  R-squared       =     0.8808
                                                  Adj R-squared   =     0.8604
                                                  Within R-sq.    =     0.3092
                                                  Root MSE        =    81.0287

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |   .1799679   .0206334     8.72   0.000     .1392372    .2206986
       _cons |  -48.70963    23.0425    -2.11   0.036    -94.19591   -3.223356
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
        dist |        10           0          10     |
          yr |        20           1          19     |
-----------------------------------------------------+

Even in the case of demeaning + OLS estimation, you need iterative demeaning for unbalanced panels to obtain equivalent results (coefficients) to the within-estimator. See https://www.statalist.org/forums/for...-time-variable.
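
To illustrate the unbalanced case, here is a minimal sketch that reuses the Grunfeld setup above and drops a random fifth of the observations. The seed and the 20% drop rate are arbitrary assumptions chosen only for illustration. With simple one-pass means, the coefficient on x from regress should in general no longer match the one from reghdfe.

Code:

* start again from the balanced Grunfeld panel
webuse grunfeld, clear
rename (invest mvalue company year) (y x dist yr)

* assumption: unbalance the panel by dropping about 20% of rows at random
set seed 12345
drop if runiform() < 0.2

* simple one-pass means, computed on the remaining sample
egen x_dist_mean = mean(x), by(dist)
egen x_yr_mean = mean(x), by(yr)

* two-way Mundlak by pooled OLS
regress y x x_dist_mean x_yr_mean

* TWFE benchmark
reghdfe y x, absorb(i.dist i.yr) keepsingletons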


Keegan Robertson

Join Date: Dec 2023

Posts: 12
#4

14 Dec 2023, 05:40

Thank you Andrew.

I followed your comments about iterative demeaning on the other thread you linked and it works perfectly. The next issue, which I did not foresee, is that the demeaned dependent variable mdy contains values outside [0, 1], whereas the original y is a ratio bounded between 0 and 1. This makes my fractional approach impossible. Do you have any thoughts on how to tackle this? In other words, the purpose of this line of inquiry is to be able to apply TWFE to fracreg.

For reference here's what I ended up with:

Code:

foreach var in x y {
    bys dist: egen mc`var' = mean(`var')
    bys yr: egen my`var' = mean(`var')
    gen md`var' = `var' - mc`var' - my`var'
    drop mc`var' my`var'
}

forvalues i = 1/4130 {
    qui {
        foreach var in x y {
            bys dist: egen mc`var' = mean(md`var')
            replace md`var' = md`var' - mc`var'
            bys yr: egen my`var' = mean(md`var')
            replace md`var' = md`var' - my`var'
            drop mc`var' my`var'
        }
    }
}

regress mdy mdx
fracreg logit mdy mdx
margins, dydx(*)
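
A quick way to see the range problem described above, assuming mdy from the loop is still in memory, is to summarize it and count the values that fall outside the unit interval. This is only a diagnostic sketch.

Code:

* range of the demeaned outcome
summarize mdy

* number of non-missing demeaned values outside [0, 1]
count if !inrange(mdy, 0, 1) & !missing(mdy)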
FernandoRios

Join Date: Apr 2014

Posts: 2455
#5

14 Dec 2023, 05:46

Hi Keegan,
try this:
net install cre, from("https://friosavila.github.io/stpackages")
then:

cre, abs(dist yr):regress y x

This should do what you need

Keegan Robertson

Join Date: Dec 2023
Posts: 12
#6

14 Dec 2023, 06:47

Thanks Fernando. I've given cre a try, and while it does produce the same coefficient on x, the standard errors and p-values are quite different from those of the other approaches. It also doesn't seem to be able to handle fracreg?

Code:

foreach var in x y{
bys dist: egen mc`var'= mean(`var')
bys yr: egen my`var'= mean(`var')
gen md`var'=`var' -mc`var'- my`var'
drop mc`var' my`var'
}

forvalues i=1/4130 {
qui {
foreach var in x y{
bys dist: egen mc`var'= mean(md`var')
replace md`var'=md`var' -mc`var'
bys yr: egen my`var'= mean(md`var')
replace md`var'=md`var' -my`var'
drop mc`var' my`var'
}
}
}

regress mdy mdx
reghdfe y x, absorb(i.dist i.yr) keepsingletons 
xtset dist yr
xtreg y x i.yr, fe
cre, abs(dist yr) keepsingletons: regress y x
cre, abs(dist yr) keepsingletons: fracreg logit y x

That leads to:

Code:

. regress mdy mdx

      Source |       SS           df       MS      Number of obs   =    19,398
-------------+----------------------------------   F(1, 19396)     =      5.90
       Model |  .050391905         1  .050391905   Prob > F        =    0.0152
    Residual |   165.75005    19,396  .008545579   R-squared       =    0.0003
-------------+----------------------------------   Adj R-squared   =    0.0003
       Total |  165.800442    19,397  .008547736   Root MSE        =    .09244

------------------------------------------------------------------------------
         mdy | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mdx |  -.0045681   .0018812    -2.43   0.015    -.0082553   -.0008809
       _cons |  -1.93e-11   .0006637    -0.00   1.000     -.001301     .001301
------------------------------------------------------------------------------

. reghdfe y x, absorb(i.dist i.yr) keepsingletons 
WARNING: Singleton observations not dropped; statistical significance is biased (link)
(MWFE estimator converged in 23 iterations)

HDFE Linear regression                            Number of obs   =     19,398
Absorbing 2 HDFE groups                           F(   1,  15213) =       4.63
                                                  Prob > F        =     0.0315
                                                  R-squared       =     0.7170
                                                  Adj R-squared   =     0.6392
                                                  Within R-sq.    =     0.0003
                                                  Root MSE        =     0.1044

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |  -.0045681   .0021241    -2.15   0.032    -.0087315   -.0004046
       _cons |   .6638376   .0010854   611.63   0.000     .6617102    .6659651
------------------------------------------------------------------------------

Absorbed degrees of freedom:
-----------------------------------------------------+
 Absorbed FE | Categories  - Redundant  = Num. Coefs |
-------------+---------------------------------------|
        dist |      4130           0        4130     |
          yr |        56           2          54     |
-----------------------------------------------------+

. xtset dist yr

Panel variable: dist (unbalanced)
 Time variable: yr, 1902 to 2014, but with gaps
         Delta: 1 unit

. xtreg y x i.yr, fe
note: 2014.yr omitted because of collinearity.

Fixed-effects (within) regression               Number of obs     =     19,398
Group variable: dist                            Number of groups  =      4,130

R-squared:                                      Obs per group:
     Within  = 0.0602                                         min =          1
     Between = 0.0028                                         avg =        4.7
     Overall = 0.0137                                         max =         43

                                                F(55, 15213)      =      17.71
corr(u_i, Xb) = -0.1205                         Prob > F          =     0.0000

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |  -.0045681   .0021241    -2.15   0.032    -.0087315   -.0004046
       _cons |   .6418146   .0111583    57.52   0.000     .6199431    .6636862
-------------+----------------------------------------------------------------
     sigma_u |  .14213149
     sigma_e |   .1043805
         rho |  .64963141   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4129, 15213) = 8.95                 Prob > F = 0.0000


. cre, abs(dist yr) keepsingletons: regress y x
WARNING: Singleton observations not dropped; statistical significance is biased (link)

      Source |       SS           df       MS      Number of obs   =    19,398
-------------+----------------------------------   F(3, 19394)     =      4.42
       Model |  .400187334         3  .133395778   Prob > F        =    0.0041
    Residual |  585.297221    19,394  .030179294   R-squared       =    0.0007
-------------+----------------------------------   Adj R-squared   =    0.0005
       Total |  585.697409    19,397  .030195257   Root MSE        =    .17372

------------------------------------------------------------------------------
           y | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
           x |   -.004568   .0035352    -1.29   0.196    -.0114972    .0023612
        m1_x |  -.0019865   .0043263    -0.46   0.646    -.0104665    .0064935
        m2_x |   .0109237   .0047217     2.31   0.021     .0016688    .0201787
       _cons |   .6638376   .0018064   367.50   0.000      .660297    .6673783
------------------------------------------------------------------------------

. cre, abs(dist yr) keepsingletons: fracreg logit y x
variable logit not found
r(111);


FernandoRios

Join Date: Apr 2014

Posts: 2455
#7

14 Dec 2023, 09:19

Good call.
So, it doesn't give you the same SE because it doesn't correct the standard errors for you.
For instance, to replicate the results with a single fixed effect you can do this:

cre, abs(dist ) : regress y x, cluster(dist)

Now, to apply it to fracreg logit, you can do the following:
cre, abs(dist yr) keep: regress y x
fracreg logit y x m1_x m2_x, cluster(dist)

So, what cre is doing is nothing more than creating the mean variables for you.
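
As a sketch of how the pieces fit together, the average marginal effect of x can then be pulled out with margins, which puts it on a scale comparable to the linear TWFE coefficient. This assumes m1_x and m2_x were left in memory by the cre call with keep shown above. The glm lines are an alternative route to the same fractional logit quasi-likelihood, not something cre does itself.

Code:

* assumes m1_x and m2_x exist from: cre, abs(dist yr) keep: regress y x
fracreg logit y x m1_x m2_x, vce(cluster dist)

* average marginal effect of x
margins, dydx(x)

* alternative: the same model fitted with glm
glm y x m1_x m2_x, family(binomial) link(logit) vce(cluster dist)
margins, dydx(x)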

Hope this helps
Fernando
Keegan Robertson

Join Date: Dec 2023

Posts: 12
#8

14 Dec 2023, 10:07

Amazing! Yes, this appears to be working as I had hoped. Thank you so much. One last thing, I am trying to include an interaction in the fixed effects. I did this with reghdfe but

Code:

cre, abs(dist state#yr) keep: regress y x

gives the error "interactions not allowed". Is there any way around that?
FernandoRios

Join Date: Apr 2014

Posts: 2455
#9

14 Dec 2023, 10:24

That will work only if you create the interaction groups by hand:
egen state_year = group(state yr)
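
A possible way to use that variable, sketched under two assumptions: state_year was created by the egen line above, and cre names the kept means m1_x and m2_x as it did in the abs(dist yr) output earlier in the thread. If those variables are already in memory from a previous run, they would need to be dropped first.

Code:

* assumes state_year already exists; pass it to abs() in place of state#yr
cre, abs(dist state_year) keep: regress y x

* assumption: the kept means are again called m1_x and m2_x
fracreg logit y x m1_x m2_x, vce(cluster dist)
margins, dydx(x)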
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2148
#10

14 Dec 2023, 14:13

Keegan: May I suggest my 2023 paper in the Econometrics Journal on nonlinear diff-in-diffs? These issues are discussed there. As you said, you can't include unit-specific FEs. So, in that paper, I show the cohort dummies allow identification of the ATTs in the nonlinear case under a certain conditional parallel trends assumption. Fractional logit is one of the examples.

As Andrew and Fernando pointed out, to get the TWFE = TWM result there are restrictions. It doesn't hold with time-varying regressors unless you add several more terms to the Mundlak regression. And, as already discussed, you have to be careful with an unbalanced panel.
FernandoRios

Join Date: Apr 2014

Posts: 2455
#11

14 Dec 2023, 16:37

Also, I should mention two things:
1. The latest version of jwdid allows you to use other estimation methods, following the extended TWFE approach.
2. It will not work with fractional logit, simply because this command doesn't allow you to cluster standard errors. In that case you may want to use glm.
f
Keegan Robertson

Join Date: Dec 2023

Posts: 12
#12

14 Dec 2023, 17:53

Thank you Jeff, Fernando, and Andrew,

Yes, I have started reading that paper (though not fully grasped it as yet). So far if I understand clearly, I have been calculating ATE rather than following a DID approach to calculate ATT (which I could potentially also do, but I need to dig deeper into your considerations of staggered-entry-staggered-exit scenarios).

Can I check my understanding of what I have generated following the cre approach that Fernando has shared:
I used an iterative demeaning approach to generate average Xs by district and by time, which are constructed from group-level averages of the explanatory variable(s) in both directions (by district and year)

I control for the means in a Fractional Logit regression with errors clustered at the district level

I collect the average marginal effects, which produces coefficients comparable to those of an OLS/linear TWFE model

Regarding the state-by-year effects I was asking about, does this mean that the cre call would be:

egen state_year = group(state yr)
cre, abs(dist) keep: regress y x i.state_year, cluster(dist)
and then I just need to add the full list of m_state_year variables into the fracreg regression?
fracreg logit y x m1_1 ... mx_, vce(cl dist)

I'm hoping there might be something a bit more efficient that I am missing.

Jeff, I'm also reading your paper on choosing the level of fixed effects to consider if my last query regarding fitting the district effects alongside a state-by-year set of effects is an improvement on the district and year effects approach.
Keegan Robertson

Join Date: Dec 2023

Posts: 12
#13

14 Dec 2023, 18:02

Fernando, it seems like fracreg logit does allow clustered errors, like this:

Code:

fracreg logit y x, vce(cl dist)

or is that not doing what I think it is?
FernandoRios

Join Date: Apr 2014

Posts: 2455
#14

14 Dec 2023, 18:15

You are right!
I need to add an exception!
I will submit an update next week to handle that.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2148
#15

14 Dec 2023, 19:43

I’m on a flight. Tomorrow I’ll provide a link to a shared Dropbox where I show how to do it. I typically use glm but fracreg also works.