how to control for individual fixed effects in Stata?

Yao Zhao

Join Date: Feb 2017

Posts: 226
#1

02 Apr 2020, 00:27

I know one method can be:

Code:

regress y x1 x2 i.id, vce(robust)

But the problem is that my data set has a very large number of ids, so this specification creates one indicator per id and kills my Stata.
Andrew Musau

Join Date: Oct 2014

Posts: 10214
#2

02 Apr 2020, 01:22

For panel data, you want to cluster your standard errors at the individual level. The -robust- option is thus appropriate for xtreg, where it clusters on the panel variable, but not for regress, where it only adjusts for heteroskedasticity.

Code:

regress y x1 x2, absorb(id) cluster(id)
xtset id time
xtreg y x1 x2, fe cluster(id)
Yao Zhao

Join Date: Feb 2017

Posts: 226
#3

02 Apr 2020, 06:54

I don't find absorb option in -help regress-. I only find absorb option in -help areg-.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

02 Apr 2020, 08:25

Yao:
surely Andrew meant -areg- instead of -regress-.

Kind regards,
Carlo
(Stata 19.0)
David HajHooj

Join Date: Apr 2020

Posts: 18
#5

02 Apr 2020, 09:26

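Hello Yao,
first install the community-contributed command -reghdfe-:

ssc install reghdfe

Then, for the regression (dependent variable first, as in your example):

reghdfe y x1 x2, a(id) vce(cluster id)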

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

02 Apr 2020, 09:42

If Yao has many panel ids but only one fixed effect, -areg-, -xtreg- and the community-contributed programme -reghdfe- give similar results (same point estimates, slightly different standard errors):

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. areg ln_wage age, absorb(idcode) vce(cluster idcode)

Linear regression, absorbing indicators         Number of obs     =     28,510
Absorbed variable: idcode                       No. of categories =      4,710
                                                F(   1,   4709)   =     738.02
                                                Prob > F          =     0.0000
                                                R-squared         =     0.6636
                                                Adj R-squared     =     0.5970
                                                Root MSE          =     0.3035

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006675    27.17   0.000     .0168262    .0194436
       _cons |   1.148214   .0193889    59.22   0.000     1.110202    1.186225
------------------------------------------------------------------------------

. xtset idcode year
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit

. xtreg ln_wage age, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1026                                         min =          1
     between = 0.0877                                         avg =        6.1
     overall = 0.0774                                         max =         15

                                                F(1,4709)         =     884.05
corr(u_i, Xb)  = 0.0314                         Prob > F          =     0.0000

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006099    29.73   0.000     .0169392    .0193306
       _cons |   1.148214   .0177153    64.81   0.000     1.113483    1.182944
-------------+----------------------------------------------------------------
     sigma_u |  .40635023
     sigma_e |  .30349389
         rho |  .64192015   (fraction of variance due to u_i)
------------------------------------------------------------------------------


. reghdfe ln_wage age, absorb(idcode) vce(cluster idcode)
(dropped 551 singleton observations)
(converged in 1 iterations)

HDFE Linear regression                            Number of obs   =     27,959
Absorbing 1 HDFE group                            F(   1,   4158) =     884.06
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.6540
                                                  Adj R-squared   =     0.5936
                                                  Within R-sq.    =     0.1026
Number of clusters (idcode)  =      4,159         Root MSE        =     0.3035

                             (Std. Err. adjusted for 4,159 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006099    29.73   0.000     .0169391    .0193307
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
-------------+-------------------------------------------------|
      idcode |            0            4159           4159 *   |
---------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation
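
To make the comparison easier to read, the three fits can be stored and tabulated side by side. This is only a sketch: the stored-estimate names are arbitrary, and -keepsingletons- is, to my knowledge, the -reghdfe- option that retains the 551 singleton observations it dropped above, so that all three commands use the same 28,510 observations. -reghdfe- must already be installed (it also relies on the -ftools- package from SSC).

```stata
* Sketch: refit the three models and compare coefficient and SE for age.
use "https://www.stata-press.com/data/r16/nlswork.dta", clear
xtset idcode year

areg ln_wage age, absorb(idcode) vce(cluster idcode)
estimates store m_areg          // name chosen for illustration

xtreg ln_wage age, fe vce(cluster idcode)
estimates store m_xtreg

* keepsingletons: keep groups with a single observation in the sample
reghdfe ln_wage age, absorb(idcode) vce(cluster idcode) keepsingletons
estimates store m_reghdfe

* One column per model, showing b, se and the estimation sample size
estimates table m_areg m_xtreg m_reghdfe, keep(age) b(%9.7f) se(%9.7f) stats(N)
```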

Kind regards,
Carlo
(Stata 19.0)


Andrew Musau

Join Date: Oct 2014
Posts: 10214

02 Apr 2020, 10:09

Carlo Lazzaro, the -absorb- option in regress is undocumented. I suspect that the estimator implements areg if you specify this option.

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

.
. regress ln_wage age, absorb(idcode) vce(cluster idcode)

Linear regression, absorbing indicators         Number of obs     =     28,510
                                                F(0, 4709)        =          .
                                                Prob > F          =          .
                                                R-squared         =     0.6636
                                                Adj R-squared     =     0.5970
                                                Root MSE          =     .30349

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006675    27.17   0.000     .0168262    .0194436
       _cons |   1.148214   .0193889    59.22   0.000     1.110202    1.186225
------------------------------------------------------------------------------
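
One way to test that suspicion is to fit both commands and compare the stored results. The sketch below assumes a Stata version in which -regress- accepts the undocumented -absorb()- option (it did in Stata 16, as shown above); the matrix names are made up for the example, and what `e(cmd)` reports is not guaranteed.

```stata
* Sketch: does regress, absorb() reproduce areg?
use "https://www.stata-press.com/data/r16/nlswork.dta", clear

areg ln_wage age, absorb(idcode) vce(cluster idcode)
matrix b_areg = e(b)
matrix V_areg = e(V)

* Undocumented option: may behave differently in other Stata versions
regress ln_wage age, absorb(idcode) vce(cluster idcode)
matrix b_reg = e(b)
matrix V_reg = e(V)

* Which command name was recorded for the last fit
display "e(cmd) = `e(cmd)'"

* Maximum relative differences; values near 0 mean the results coincide
display mreldif(b_reg, b_areg)
display mreldif(V_reg, V_areg)
```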


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#8

02 Apr 2020, 10:14

Andrew Musau :
many thanks for pointing this out.
Admittedly, I was not aware of the -absorb()- option for -regress- either.
As a consequence, my previous reply #4 reflects my lack of knowledge about this option and is not helpful for the original poster.

Kind regards,
Carlo
(Stata 19.0)
Andrew Musau

Join Date: Oct 2014

Posts: 10214
#9

02 Apr 2020, 10:58

Your example in #7 is very helpful as it illustrates that you get the same results whether you use areg, xtreg or reghdfe. Your remark in #4 may very well be true for earlier versions of Stata. I have only checked this in Stata 16.

Last edited by Andrew Musau; 02 Apr 2020, 11:00.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#10

02 Apr 2020, 11:02

Andrew Musau:
very interesting thread.
Thanks, as usual, for your contributions.

Kind regards,
Carlo
(Stata 19.0)
Yao Zhao

Join Date: Feb 2017

Posts: 226
#11

02 Apr 2020, 19:51

Whether you use areg, xtreg or reghdfe doesn't affect the point estimate, but the standard errors still differ slightly.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#12

03 Apr 2020, 01:50

Yao:
yes, I meant that point estimates are the same. As expected, standard errors differ.
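
A likely source of the gap is the finite-sample degrees-of-freedom correction applied to the clustered variance: -areg- counts the 4,710 absorbed indicators among the estimated parameters, whereas -xtreg, fe- and -reghdfe- do not when the fixed effects are nested within the clusters. The quick check below uses the numbers reported in the output above; the exact parameter counts (2 for -xtreg-, 4,711 for -areg-) are an assumption made for illustration.

```stata
* Ratio of the reported clustered standard errors for age (areg / xtreg)
display .0006675/.0006099

* Ratio implied if the two variances differ only through the 1/(N-K) factor:
* N = 28,510; K = 2 for xtreg (age, _cons); K = 4,711 for areg (age + 4,710 ids)
display sqrt((28510 - 2)/(28510 - 4711))
```

Both expressions come out at roughly 1.094, which is consistent with the difference being a degrees-of-freedom adjustment rather than a different estimator.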

Kind regards,
Carlo
(Stata 19.0)