Fixed effects GLS (FEGLS)

Chiraz KARAMTI

Join Date: Oct 2016

Posts: 59
#1

03 Jan 2017, 08:01

Hi,
Which Stata command estimates FEGLS (fixed effects GLS) as in Wooldridge (2002)? Is it vce(cluster)? Is it xtregar?
With a random effects model it is easy: we use xtgls with the right variance structure and it's done. With a fixed effects model, however, how can we correct simultaneously for autocorrelation and heteroscedasticity in both the within and between dimensions?
Thanks a lot, I really need clarification.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#2

03 Jan 2017, 08:04

Chiraz:
if you're dealing with a large N, short T panel dataset, -vce(cluster)- can handle both autocorrelation and heteroskedasticity.

Kind regards,
Carlo
(Stata 19.0)
Chiraz KARAMTI

Join Date: Oct 2016

Posts: 59
#3

04 Jan 2017, 04:42

Hi Carlo,
First, happy new year.
Thank you for responding. I know that in Stata, when we have both heteroscedasticity and autocorrelation, we have to use vce(cluster). If we have only autocorrelation we use xtregar; with only heteroscedasticity we use robust.
Sorry if I did not express myself well; I am from Tunisia, so not a native English speaker. My question is more theoretical. Wooldridge (2002, p. 277) explains the fixed effects GLS procedure, which consists of estimating FE, taking the residuals, dropping an observation, estimating the variance, etc. Is that what vce(cluster) really does? What is the theoretical background of this option? Can we say in a paper that we are using generalized fixed effects when we correct these problems with vce(cluster)?
Thank you for your help. I'm giving a course on this and I really don't want to give wrong explanations to my students.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#4

04 Jan 2017, 06:36

Chiraz:
again, the main question is whether you're talking about a large N, small T panel dataset or the other way round.
As an aside, please note that, under -xtreg-, -vce(robust)- and -vce(cluster panelid)- do the same job.
As far as -xtregar- is concerned, it seems more suitable for a small N, large T panel dataset.
I reciprocate all the best for 2017.

Kind regards,
Carlo
(Stata 19.0)
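
The aside in #4 about -vce(robust)- and -vce(cluster)- can be checked directly. The following is a minimal sketch, not part of the original reply: it assumes the -nlswork- example dataset (the same one used later in this thread) and stores the two variance matrices under the illustrative names V_robust and V_cluster.

```stata
* Sketch: compare vce(robust) with vce(cluster idcode) after -xtreg, fe-
webuse nlswork, clear
xtset idcode year

quietly xtreg ln_wage c.age##c.age i.year, fe vce(robust)
matrix V_robust = e(V)

quietly xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
matrix V_cluster = e(V)

* Largest relative difference between the two variance matrices;
* if the two options coincide this is numerically zero
display mreldif(V_robust, V_cluster)
```

If the displayed difference is negligible, the two options produced the same standard errors for this model.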
Chiraz KARAMTI

Join Date: Oct 2016

Posts: 59
#5

06 Jan 2017, 01:27

Thanks Carlo, it's a large N and small T.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#6

06 Jan 2017, 02:26

Chiraz:
I would go with -xtreg- and -vce(cluster)-.

Kind regards,
Carlo
(Stata 19.0)
Dora King

Join Date: Nov 2018

Posts: 1
#7

29 Nov 2020, 09:22

Originally posted by Carlo Lazzaro

Chiraz:
I would go with -xtreg- and -vce(cluster)-.

This still gives OLS, not GLS.
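
For readers following the thread, the distinction raised here can be written out. This is a sketch in textbook notation along the lines of the Wooldridge (2002) procedure described in #3; the symbols are labels chosen for illustration (double dots denote time-demeaned data for unit i, and FEGLS uses the demeaned data with one time period dropped).

```latex
% Within (FE) estimator: OLS on time-demeaned data
\hat{\beta}_{FE} = \Big(\sum_{i=1}^{N} \ddot{X}_i'\ddot{X}_i\Big)^{-1} \sum_{i=1}^{N} \ddot{X}_i'\ddot{y}_i

% Cluster-robust variance: same point estimate, sandwich standard errors
% (software typically adds a finite-sample correction factor)
\widehat{\mathrm{Avar}}(\hat{\beta}_{FE}) =
  \Big(\sum_{i} \ddot{X}_i'\ddot{X}_i\Big)^{-1}
  \Big(\sum_{i} \ddot{X}_i'\hat{\ddot{u}}_i\hat{\ddot{u}}_i'\ddot{X}_i\Big)
  \Big(\sum_{i} \ddot{X}_i'\ddot{X}_i\Big)^{-1}

% FEGLS: estimate the error covariance from FE residuals, then reweight
\hat{\Omega} = N^{-1}\sum_{i=1}^{N} \hat{\ddot{u}}_i\hat{\ddot{u}}_i', \qquad
\hat{\beta}_{FEGLS} = \Big(\sum_{i} \ddot{X}_i'\hat{\Omega}^{-1}\ddot{X}_i\Big)^{-1}
                      \sum_{i} \ddot{X}_i'\hat{\Omega}^{-1}\ddot{y}_i
```

Clustering changes only the variance formula of the within estimator, whereas FEGLS changes the point estimates through the weighting matrix.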
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#8

29 Nov 2020, 09:35

Dora:
welcome to this forum.
Can you please elaborate on your previous reply? Thanks.

Kind regards,
Carlo
(Stata 19.0)
Ale Zapata

Join Date: Jan 2021

Posts: 10
#9

02 Jan 2021, 08:16

Hi Carlo,
I was wondering whether using xtreg with vce(cluster) is still an OLS-type regression rather than FEGLS (fixed effects GLS). Also, I have used xtreg with vce(cluster) as you suggested in the previous comments. Nevertheless, I still detect heteroskedasticity with xttest3 and serial correlation with xtserial. Would you recommend other tests besides xttest3 and xtserial (the latter is the only test I can use for serial correlation given my data)?
Thank you in advance.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17707

#10

02 Jan 2021, 08:43

Ale:
welcome to this forum.
1) -xtreg, fe- gives more informative output for a panel data regression with an -fe- specification and a continuous regressand. Please note that, as far as the shared coefficients are concerned, you get the same point estimates with -regress- and -xtreg, fe-, as you can see from the following toy example (that said, my preference goes to -xtreg, fe-):

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. regress ln_wage c.age##c.age i.year i.idcode if idcode<=3, vce(cluster idcode)

Linear regression                               Number of obs     =         39
                                                F(2, 2)           =          .
                                                Prob > F          =          .
                                                R-squared         =     0.8139
                                                Root MSE          =     .21943

                                 (Std. Err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0773019   .0106911     7.23   0.019     .0313017    .1233021
             |
 c.age#c.age |  -.0045583    .002264    -2.01   0.182    -.0142995    .0051828
             |
        year |
         69  |   .3367906   .0914392     3.68   0.066    -.0566405    .7302218
         70  |   .2089384   .2867011     0.73   0.542    -1.024637    1.442514
         71  |   .3144116   .1619035     1.94   0.192     -.382203    1.011026
         72  |   .5888124   .4958888     1.19   0.357    -1.544825     2.72245
         73  |   .8912873   .5219448     1.71   0.230     -1.35446    3.137034
         75  |   1.246958   .6073839     2.05   0.176    -1.366404     3.86032
         77  |   1.560689   .8626802     1.81   0.212    -2.151125    5.272502
         78  |   1.941522   1.278416     1.52   0.268    -3.559059    7.442103
         80  |    2.34498   1.525965     1.54   0.264    -4.220718    8.910678
         82  |   2.698954   1.663018     1.62   0.246    -4.456435    9.854344
         83  |   2.994437    1.81452     1.65   0.241    -4.812813    10.80169
         85  |   3.538578   2.210833     1.60   0.251    -5.973868    13.05102
         87  |   3.965153   2.460506     1.61   0.248    -6.621548    14.55185
         88  |    4.40786   2.688929     1.64   0.243    -7.161667    15.97739
             |
      idcode |
          2  |  -.4183815   .0165036   -25.35   0.002    -.4893909   -.3473721
          3  |   .6579353   .7215294     0.91   0.458    -2.446555    3.762426
             |
       _cons |   1.341224   .1489003     9.01   0.012     .7005575     1.98189
------------------------------------------------------------------------------

. xtreg ln_wage c.age##c.age i.year if idcode<=3, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =         39
Group variable: idcode                          Number of groups  =          3

R-sq:                                           Obs per group:
     within  = 0.7404                                         min =         12
     between = 0.4068                                         avg =       13.0
     overall = 0.4014                                         max =         15

                                                F(4,2)            =          .
corr(u_i, Xb)  = -0.8560                        Prob > F          =          .

                                 (Std. Err. adjusted for 3 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0773019   .0101936     7.58   0.017     .0334424    .1211613
             |
 c.age#c.age |  -.0045583   .0021586    -2.11   0.169    -.0138461    .0047294
             |
        year |
         69  |   .3367906   .0871839     3.86   0.061    -.0383313    .7119126
         70  |   .2089384   .2733588     0.76   0.525    -.9672295    1.385106
         71  |   .3144116   .1543689     2.04   0.179    -.3497843    .9786076
         72  |   .5888124   .4728115     1.25   0.339    -1.445531    2.623156
         73  |   .8912873   .4976548     1.79   0.215    -1.249948    3.032523
         75  |   1.246958   .5791178     2.15   0.164    -1.244785    3.738701
         77  |   1.560689   .8225333     1.90   0.198    -1.978387    5.099764
         78  |   1.941522   1.218922     1.59   0.252    -3.303077    7.186121
         80  |    2.34498   1.454951     1.61   0.248    -3.915167    8.605128
         82  |   2.698954   1.585626     1.70   0.231    -4.123442     9.52135
         83  |   2.994437   1.730077     1.73   0.226    -4.449484    10.43836
         85  |   3.538578   2.107946     1.68   0.235    -5.531183    12.60834
         87  |   3.965153      2.346     1.69   0.233     -6.12887    14.05918
         88  |    4.40786   2.563793     1.72   0.228    -6.623251    15.43897
             |
       _cons |   1.465543   .3990418     3.67   0.067    -.2513952    3.182481
-------------+----------------------------------------------------------------
     sigma_u |  .54258328
     sigma_e |  .21942548
         rho |  .85944136   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

2) re-running heteroskedasticity and/or autocorrelation tests after you've invoked non-default standard errors is a waste of time: the non-default options affect how the standard errors are calculated, not the residuals. Therefore, the tests will keep suggesting that you reject the null.
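
Point 2 can be illustrated with a short sketch (an illustration added for readers, assuming the -nlswork- example data; the names b_default, b_cluster, e_default, e_cluster and e_gap are made up for this example). It fits the same fixed-effects model with default and clustered standard errors and compares coefficients and residuals.

```stata
* Sketch: clustering changes the standard errors, not the fit
webuse nlswork, clear
xtset idcode year

quietly xtreg ln_wage c.age##c.age i.year, fe
matrix b_default = e(b)
predict double e_default, e      // idiosyncratic residuals, default VCE

quietly xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
matrix b_cluster = e(b)
predict double e_cluster, e      // idiosyncratic residuals, clustered VCE

* Relative difference in coefficients and absolute gap in residuals
display mreldif(b_default, b_cluster)
generate double e_gap = abs(e_default - e_cluster)
summarize e_gap
```

Because the residuals do not change, any test computed from them gives the same verdict before and after -vce(cluster)- is added.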

Kind regards,
Carlo
(Stata 19.0)


Jeff Wooldridge

Join Date: Apr 2014

Posts: 2167
#11

02 Jan 2021, 19:42

A few things.

1. As Carlo points out, one can always use fixed effects (or first differencing, for that matter) and use vce(cluster id) to obtain standard errors robust to serial correlation and heteroskedasticity.

2. However, one may be giving up too much efficiency. If the clustered standard errors of the FE estimator are "large," leading to wide confidence intervals, then one might want to try a GLS method.

3. I recommend using GLS after first differencing. This is asymptotically just as efficient as FEGLS but is easier to implement.

Code:

xtset id year
xtgee D.(y x1 ... xK d2 ... dT), corr(uns) vce(robust)

This applies GEE, allowing for unrestricted correlations across time. Unfortunately, GEE imposes constant variance, so it is not full GLS. But it is what is easy to implement in Stata.

By the way, if the usual FE estimator is efficient, that will be uncovered by the FDGLS approach.
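
To make the FDGLS suggestion concrete, here is a minimal runnable sketch on simulated data. Everything in it is an assumption for illustration: the variable names, the panel dimensions (N = 500, T = 5), the true slope of 0.5, and the omission of year dummies to keep the example short. It is not code from the original post.

```stata
* Sketch: GEE on first differences with unrestricted correlation (simulated data)
clear
set seed 12345
set obs 500
generate id = _n
generate a = rnormal()              // unit effect, correlated with x1 below
expand 5
bysort id: generate year = _n
generate x1 = rnormal() + a
generate y  = 1 + 0.5*x1 + a + rnormal()

xtset id year

* First differencing removes a; corr(uns) leaves the correlation of the
* differenced errors across time unrestricted
xtgee D.(y x1), corr(uns) vce(robust)
```

With real data the call would also include the differenced year dummies, as in the code above this sketch.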
Ale Zapata

Join Date: Jan 2021

Posts: 10
#12

03 Jan 2021, 10:02

Dear Carlo and Jeff
Thank you for your quick response.
First of all, since vce(cluster) only affects the standard error calculation, how can I test that heteroskedasticity and serial correlation are resolved?
I thought of using first differences and GLS. However, I have an unbalanced panel, so I thought that taking first differences would bias the results. Also, I have tried xtregar, but I cannot test for heteroskedasticity and serial correlation after it (xttest2, xttest3, xtqptest, and xtcsd do not seem to work). What do you suggest?
Thank you in advance.
It is quite difficult to contact my supervisor these days, so I am quite lost.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#13

03 Jan 2021, 10:36

Ale:
1) as previously replied, you should consider heteroskedasticity and/or serial correlation dealt with once you invoke -vce(cluster panelid)-, and not check for those nuisances anymore;
2) Jeff's points 2. and 3. should have clarified whether, in your case, -vce(cluster panelid)- (which is the simplest solution, provided that the number of clusters is large enough) works well or a more demanding approach (first differencing + GLS) is necessary;
3) -xtregar- was developed for T>N panel datasets: is that your case?

Kind regards,
Carlo
(Stata 19.0)
Ale Zapata

Join Date: Jan 2021

Posts: 10
#14

04 Jan 2021, 09:42

Dear Carlo,

Thank you for your answer.
I am working with several datasets, split by the income level of the countries. So in some datasets I have N>T and in others T>N. I would therefore use vce(cluster) when N>T and xtregar, fe when T>N. Is that correct? Would you recommend taking first differences even if the panel is unbalanced (I used the tsfill command)?

Thank you in advance
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2167
#15

05 Jan 2021, 09:54

In all cases I would assume N and T are of a similar magnitude; whether one is a bit bigger than the other is not relevant. You're in a macro-type setting if you're using country data with lots of years.

As long as N is never a lot smaller than T, I would try clustering by country after using fixed effects. I would also use xtscc (user written) to compute Newey-West standard errors. The estimation is the same: fixed effects, and you should include year effects, too. These are two different ways of computing standard errors. xtscc allows for cross-sectional correlation but imposes weak dependence in the T dimension.
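
The two ways of computing standard errors can be put side by side. This is a hedged sketch for readers, not code from the post: it assumes the -grunfeld- example dataset as a stand-in for a country panel, builds year dummies by hand under the made-up names yr1-yr20, and relies on the user-written -xtscc- being installable from SSC (it reports Driscoll-Kraay standard errors, a Newey-West type correction). The scalars b_cluster and b_dk are illustrative names.

```stata
* Sketch: same FE point estimates, two ways of computing standard errors
capture which xtscc
if _rc ssc install xtscc            // user-written command, installed if missing

webuse grunfeld, clear
xtset company year

* Year dummies built by hand so both commands see identical regressors
quietly tabulate year, generate(yr)

* (a) fixed effects, standard errors clustered by panel unit
xtreg invest mvalue kstock yr2-yr20, fe vce(cluster company)
scalar b_cluster = _b[mvalue]

* (b) fixed effects, Driscoll-Kraay standard errors
xtscc invest mvalue kstock yr2-yr20, fe
scalar b_dk = _b[mvalue]

* The point estimates should agree; only the standard errors differ
display reldif(b_cluster, b_dk)
```

The default lag length chosen by -xtscc- is used here; its lag() option can be set explicitly if a different truncation is wanted.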