  • Fixed effects GLS (FEGLS)

    Hi,
    Could someone tell me which Stata command estimates FEGLS (fixed effects GLS) as in Wooldridge (2002)? Is it vce(cluster)? Is it xtregar?
    For a random effects model it is easy: we use xtgls with the right variance structure and we are done. For a fixed effects model, however, how do we correct simultaneously for autocorrelation and heteroskedasticity in both the within and between dimensions?
    Thanks a lot, I really need clarification.

  • #2
    Chiraz:
    if you're dealing with a large N, short T panel dataset, -vce(cluster)- can handle both autocorrelation and heteroskedasticity.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hi Carlo,
      First, happy new year.
      Thank you for responding. I know that in Stata, when we have both heteroskedasticity and autocorrelation, we have to use vce(cluster). If we have just autocorrelation we use xtregar; with heteroskedasticity alone we use robust.
      Sorry if I did not express myself well; I am from Tunisia, so not a native English speaker. My question is more theoretical. Wooldridge (2002, p. 277) explains the fixed effects GLS procedure, which consists of estimating FE, taking the residuals, dropping an observation, estimating the variance, etc. Is that what vce(cluster) really does? What is the theoretical background of this option? Can we say in a paper that we are using generalized fixed effects when we correct these problems with vce(cluster)?
      Thank you for your help. I am teaching a course on this and I really don't want to give wrong explanations to my students.
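
For reference, a sketch of the FEGLS estimator that the cited section describes. The notation is assumed here rather than quoted from the book: double dots denote time-demeaned data, one time period is dropped so that each unit contributes T-1 rows, and the residuals come from a first-stage FE fit.

```latex
\hat{\Omega} = \frac{1}{N}\sum_{i=1}^{N} \hat{\ddot{u}}_i \hat{\ddot{u}}_i',
\qquad
\hat{\beta}_{FEGLS} =
\Bigl(\sum_{i=1}^{N} \ddot{X}_i' \hat{\Omega}^{-1} \ddot{X}_i\Bigr)^{-1}
\Bigl(\sum_{i=1}^{N} \ddot{X}_i' \hat{\Omega}^{-1} \ddot{y}_i\Bigr)
```

By contrast, the within (FE) estimator sets the weighting matrix to the identity, and a cluster-robust variance estimator changes only the estimated variance of that estimator, not the coefficients.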


      • #4
        Chiraz:
        again, the main question is whether you're talking about a large N, small T panel dataset or the other way round.
        As an aside, please note that, under -xt-, -vce(robust)- and -vce(cluster)- do the same job.
        As far as -xtregar- is concerned, it seems more suitable for small N, large T panel datasets.
        I reciprocate all the best for 2017.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Thanks Carlo, it's a large N and small T.


          • #6
            Chiraz:
            I would go with -xtreg- and -vce(cluster)-.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              Originally posted by Carlo Lazzaro
              Chiraz:
              I would go with -xtreg- and -vce(cluster)-.

              This still gives OLS, not GLS.


              • #8
                Dora:
                welcome to this forum.
                Can you please elaborate on your previous reply? Thanks.
                Kind regards,
                Carlo
                (Stata 19.0)


                • #9
                  Hi Carlo,
                  I was wondering whether using xtreg with vce(cluster) is still an OLS-type regression rather than FEGLS (fixed effects GLS). I have used xtreg with vce(cluster) as you suggested in the previous comments. Nevertheless, I still find heteroskedasticity with xttest3 and serial correlation with xtserial. Would you recommend other tests besides xttest3 and xtserial (the latter is the only test I can use for serial correlation given my data)?
                  Thank you in advance.


                  • #10
                    Ale:
                    welcome to this forum.
                    1) -xtreg,fe- is the more informative estimator for panel data regression with an -fe- specification and a continuous regressand. Please note that, as far as the shared coefficients are concerned, you get the same point estimates with -regress- and -xtreg,fe-, as you can see from the following toy example (that said, my preference goes to -xtreg,fe-):
                    Code:
                    . use "https://www.stata-press.com/data/r16/nlswork.dta"
                    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                    
                    . regress ln_wage c.age##c.age i.year i.idcode if idcode<=3, vce(cluster idcode)
                    
                    Linear regression                               Number of obs     =         39
                                                                    F(2, 2)           =          .
                                                                    Prob > F          =          .
                                                                    R-squared         =     0.8139
                                                                    Root MSE          =     .21943
                    
                                                     (Std. Err. adjusted for 3 clusters in idcode)
                    ------------------------------------------------------------------------------
                                 |               Robust
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             age |   .0773019   .0106911     7.23   0.019     .0313017    .1233021
                                 |
                     c.age#c.age |  -.0045583    .002264    -2.01   0.182    -.0142995    .0051828
                                 |
                            year |
                             69  |   .3367906   .0914392     3.68   0.066    -.0566405    .7302218
                             70  |   .2089384   .2867011     0.73   0.542    -1.024637    1.442514
                             71  |   .3144116   .1619035     1.94   0.192     -.382203    1.011026
                             72  |   .5888124   .4958888     1.19   0.357    -1.544825     2.72245
                             73  |   .8912873   .5219448     1.71   0.230     -1.35446    3.137034
                             75  |   1.246958   .6073839     2.05   0.176    -1.366404     3.86032
                             77  |   1.560689   .8626802     1.81   0.212    -2.151125    5.272502
                             78  |   1.941522   1.278416     1.52   0.268    -3.559059    7.442103
                             80  |    2.34498   1.525965     1.54   0.264    -4.220718    8.910678
                             82  |   2.698954   1.663018     1.62   0.246    -4.456435    9.854344
                             83  |   2.994437    1.81452     1.65   0.241    -4.812813    10.80169
                             85  |   3.538578   2.210833     1.60   0.251    -5.973868    13.05102
                             87  |   3.965153   2.460506     1.61   0.248    -6.621548    14.55185
                             88  |    4.40786   2.688929     1.64   0.243    -7.161667    15.97739
                                 |
                          idcode |
                              2  |  -.4183815   .0165036   -25.35   0.002    -.4893909   -.3473721
                              3  |   .6579353   .7215294     0.91   0.458    -2.446555    3.762426
                                 |
                           _cons |   1.341224   .1489003     9.01   0.012     .7005575     1.98189
                    ------------------------------------------------------------------------------
                    
                    . xtreg ln_wage c.age##c.age i.year if idcode<=3, fe vce(cluster idcode)
                    
                    Fixed-effects (within) regression               Number of obs     =         39
                    Group variable: idcode                          Number of groups  =          3
                    
                    R-sq:                                           Obs per group:
                         within  = 0.7404                                         min =         12
                         between = 0.4068                                         avg =       13.0
                         overall = 0.4014                                         max =         15
                    
                                                                    F(4,2)            =          .
                    corr(u_i, Xb)  = -0.8560                        Prob > F          =          .
                    
                                                     (Std. Err. adjusted for 3 clusters in idcode)
                    ------------------------------------------------------------------------------
                                 |               Robust
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                             age |   .0773019   .0101936     7.58   0.017     .0334424    .1211613
                                 |
                     c.age#c.age |  -.0045583   .0021586    -2.11   0.169    -.0138461    .0047294
                                 |
                            year |
                             69  |   .3367906   .0871839     3.86   0.061    -.0383313    .7119126
                             70  |   .2089384   .2733588     0.76   0.525    -.9672295    1.385106
                             71  |   .3144116   .1543689     2.04   0.179    -.3497843    .9786076
                             72  |   .5888124   .4728115     1.25   0.339    -1.445531    2.623156
                             73  |   .8912873   .4976548     1.79   0.215    -1.249948    3.032523
                             75  |   1.246958   .5791178     2.15   0.164    -1.244785    3.738701
                             77  |   1.560689   .8225333     1.90   0.198    -1.978387    5.099764
                             78  |   1.941522   1.218922     1.59   0.252    -3.303077    7.186121
                             80  |    2.34498   1.454951     1.61   0.248    -3.915167    8.605128
                             82  |   2.698954   1.585626     1.70   0.231    -4.123442     9.52135
                             83  |   2.994437   1.730077     1.73   0.226    -4.449484    10.43836
                             85  |   3.538578   2.107946     1.68   0.235    -5.531183    12.60834
                             87  |   3.965153      2.346     1.69   0.233     -6.12887    14.05918
                             88  |    4.40786   2.563793     1.72   0.228    -6.623251    15.43897
                                 |
                           _cons |   1.465543   .3990418     3.67   0.067    -.2513952    3.182481
                    -------------+----------------------------------------------------------------
                         sigma_u |  .54258328
                         sigma_e |  .21942548
                             rho |  .85944136   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    
                    .
                    2) re-running heteroskedasticity and/or autocorrelation tests after you've invoked non-default standard errors is a waste of time, as the non-default options affect the standard error calculation, not the residuals: therefore, the tests will keep suggesting that you reject the null.
                    Kind regards,
                    Carlo
                    (Stata 19.0)
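
The claim that non-default standard errors leave the coefficients and residuals untouched can be checked directly. The sketch below is only an illustration: it assumes the full nlswork data fetched with -webuse- (not the three-panel subsample above), and names such as b_default, e_cluster and e_gap are made up for the example.

```stata
* Sketch: vce(cluster) changes the standard errors only; point estimates
* and residuals are the same as in the default FE fit.
webuse nlswork, clear
xtset idcode year

* Default (conventional) standard errors
xtreg ln_wage c.age##c.age i.year, fe
matrix b_default = e(b)
predict double e_default if e(sample), e

* Cluster-robust standard errors
xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
matrix b_cluster = e(b)
predict double e_cluster if e(sample), e

* Both comparisons should be zero up to rounding error
display mreldif(b_default, b_cluster)
generate double e_gap = abs(e_default - e_cluster)
summarize e_gap
```

Because the residuals are identical, any test computed from them gives the same answer before and after -vce(cluster idcode)- is specified.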


                    • #11
                      A few things.

                      1. As Carlo points out, one can always use fixed effects (or first differencing, for that matter) and use vce(cluster id) to obtain standard errors robust to serial correlation and heteroskedasticity.

                      2. However, one may be giving up too much efficiency. If the clustered standard errors of the FE estimator are "large," leading to wide confidence intervals, then one might want to try a GLS method.

                      3. I recommend using GLS after first differencing. This is asymptotically just as efficient as FEGLS but is easier to implement.

                      Code:
                      xtset id year
                      xtgee D.(y x1 ... xK d2 ... dT), corr(uns) vce(robust)
                      This applies GEE, allowing for unrestricted correlations across time. Unfortunately, GEE imposes constant variance, so it is not full GLS. But it is what is easy to implement in Stata.

                      By the way, if the usual FE estimator is efficient, that will be uncovered by the FDGLS approach.
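
As a rough sketch of the distinction drawn above between full GLS and GEE (the notation is assumed here, not taken from the post): full FDGLS weights the differenced equations by an unrestricted estimate of the variance matrix of the differenced errors, whereas Gaussian GEE with corr(uns) uses a working covariance made of a single variance parameter and an unrestricted correlation matrix R.

```latex
\Delta y_{it} = \Delta x_{it}\beta + \Delta u_{it}, \qquad t = 2,\dots,T

\hat{\beta}_{FDGLS} =
\Bigl(\sum_{i=1}^{N} \Delta X_i' \hat{\Omega}_{\Delta}^{-1} \Delta X_i\Bigr)^{-1}
\Bigl(\sum_{i=1}^{N} \Delta X_i' \hat{\Omega}_{\Delta}^{-1} \Delta y_i\Bigr),
\qquad \Omega_{\Delta} = \mathrm{Var}(\Delta u_i)

\text{GEE working covariance:}\quad V = \sigma^2 R
```

With vce(robust), inference stays valid even when the working covariance does not match the true variance matrix; only efficiency is affected.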


                      • #12
                        Dear Carlo and Jeff,
                        Thank you for your quick responses.
                        First of all, since vce(cluster) affects only the standard error calculation, how can I test that heteroskedasticity and serial correlation have been resolved?
                        I thought of using first differencing and GLS. However, I have an unbalanced panel, so I thought that taking first differences would bias the results. Also, I have tried xtregar, but I cannot test for heteroskedasticity and serial correlation after it (xttest2, xttest3, xtqptest and xtcsd do not seem to work). What do you suggest?
                        Thank you in advance.
                        It is quite difficult to contact my supervisor these days, so I am quite lost.


                        • #13
                          Ale:
                          1) as previously replied, you should consider heteroskedasticity and/or serial correlation dealt with simply by invoking -vce(cluster panelid)-, and not check for those nuisances anymore;
                          2) Jeff's points 2 and 3 should have clarified whether, in your case, -vce(cluster panelid)- (which is the simplest solution, provided that the number of clusters is large enough) works well or a more demanding approach (first differencing + GLS) is necessary;
                          3) -xtregar- was developed for T>N panel datasets: is that your case?
                          Kind regards,
                          Carlo
                          (Stata 19.0)


                          • #14
                            Dear Carlo,

                            Thank you for your answer.
                            I am working with several datasets, split by the income level of the countries. As a result, in some datasets N>T and in others T>N. I would therefore use vce(cluster) when N>T and xtregar, fe when T>N. Is that correct? Would you recommend taking first differences even if the panel is unbalanced (I used the tsfill command)?

                            Thank you in advance


                            • #15
                              In all cases I would assume N and T are of a similar magnitude; whether one is a bit bigger than the other is not relevant. You're in a macro-type setting if you're using country data with lots of years.

                              As long as N is never a lot smaller than T, I would try clustering by country after using fixed effects. I would also use xtscc (user written) to compute Newey-West standard errors. The estimation is the same: fixed effects, and you should include year effects, too. These are two different ways of computing standard errors. xtscc allows for cross-sectional correlation but imposes weak dependence in the T dimension.
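
A minimal sketch of that comparison, under stated assumptions: the Grunfeld data (10 firms, 20 years) stand in for a country panel, the scalar names are made up for the example, and xtscc is the user-written command from SSC, which reports Driscoll-Kraay (Newey-West type) standard errors. The -xi:- prefix is used to build the year dummies in case the installed xtscc does not accept factor-variable notation.

```stata
* Sketch: same FE point estimates, two ways of computing standard errors.
capture which xtscc
if _rc ssc install xtscc

webuse grunfeld, clear
xtset company year

* Fixed effects with year effects, standard errors clustered by panel unit
xtreg invest mvalue kstock i.year, fe vce(cluster company)
scalar b_cluster  = _b[mvalue]
scalar se_cluster = _se[mvalue]

* Same within estimator, Driscoll-Kraay standard errors.
* lag() is left at the command's default.
xi: xtscc invest mvalue kstock i.year, fe
scalar b_scc  = _b[mvalue]
scalar se_scc = _se[mvalue]

display "coefficient: " b_cluster " vs " b_scc
display "std. error:  " se_cluster " vs " se_scc
```

The coefficients should agree, since both commands fit the same within regression; only the standard errors are expected to differ.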
