  • Pooled OLS with Clustered Robust S.E vs Random Effects (Robust)?

    I am working on a project with a time-invariant independent variable (childhood nutrition at age 5). So far I have been using a pooled OLS model with standard errors clustered at the panel (household) level, since my dataset is a panel and clustering corrects for serial correlation of the errors within a cluster (household) across time. I have 2000 clusters with 5 observations per cluster.

    I cannot use fixed effects because my main independent variable is time-invariant. I was hoping to compare my pooled OLS results (with cluster-robust standard errors) against a random effects model. I am having some trouble understanding the relative benefits of doing so because, from my reading, both approaches aim to correct for serial correlation and heteroskedasticity in the errors across time. I have 2 specific questions:

    1) What are the benefits of random effects over pooled OLS with cluster-robust standard errors? Do they correct for different phenomena? (Any resources, i.e. links, you might be able to provide would be useful here.)
    2) In Stata, what does the robust option do with random effects? Does random effects without robust already correct for serial correlation?


  • #2
    Thomas:
    without sharing what you typed and what Stata gave you back, it's difficult to get really helpful replies.
    That said, if you have a T>N panel dataset, you may want to consider -xtgls- and -xtregar-.
    As far as your questions are concerned:
    1) I would start from -regress-, -xtreg-, -xtgls- and -xtregar- entries (and related references) in Stata .pdf manual;
    2) Under -regress-, the -robust- option takes only heteroskedasticity into account; under -xtreg- (with the -fe- or -re- specification) the -robust- and -cluster- options do the very same job (ie, cluster-robust standard errors, which take heteroskedasticity and/or autocorrelation into account). -xt- commands for T>N panel datasets (like the one you're dealing with) have more sophisticated options to model the error term.
    Kind regards,
    Carlo
    (Stata 19.0)
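
    The distinction drawn in point 2) can be checked directly. The following is only a minimal sketch: it assumes the -nlswork- example dataset (not the poster's data), and the regressors are picked purely for illustration.

    ```stata
    * Illustrative sketch on an example dataset (assumption: not the poster's data)
    webuse nlswork, clear
    xtset idcode year

    * -regress- with vce(robust): robust to heteroskedasticity only
    regress ln_wage c.age##c.age tenure, vce(robust)

    * -regress- with vce(cluster): also allows arbitrary correlation within idcode
    regress ln_wage c.age##c.age tenure, vce(cluster idcode)

    * Under -xtreg- the two options should report the same (cluster-robust) SEs
    xtreg ln_wage c.age##c.age tenure, re vce(robust)
    xtreg ln_wage c.age##c.age tenure, re vce(cluster idcode)
    ```

    If the two -xtreg- tables report identical standard errors, that confirms that -robust- and -cluster- coincide there, while the two -regress- tables will generally differ.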


    • #3
      Carlo, thank you for your reply. I will try to post my code in a separate reply. Prior to this:

      Could you clarify what T>N refers to in this instance? To clarify, I have a panel survey with 5 rounds and 2000 households. As I am clustering at the household (panel) level, does that mean that N in this instance is 2000? Or is that T?

      Secondly, under -regress- with the add-on (cluster panel), I understand that the standard errors take into account both heteroskedasticity and serial correlation within clusters (as opposed to the basic robust option, which, as you mentioned, only takes the former into account). As the robust option in xtreg, re appears to handle both of these as well, is there any general rule regarding the relative benefits of either approach (pooled OLS with cluster-robust standard errors at the panel level vs random effects with robust)? Or should I just run both approaches and compare them?


      • #4
        Thomas:
        you're correct. You have an N>T panel dataset; hence, among the -xt- commands, if your regressand is continuous, -xtreg- is the way to go.
        Moreover, it is right to cluster your standard errors on the -panelid-.
        If you have heteroskedasticity and serial correlation after -regress-, it's wise to invoke the -cluster- option; -robust- works fine if you have heteroskedasticity only.
        Under -xtreg- both options do the very same job.
        Whenever I have a panel dataset with a continuous regressand, I consider -xtreg- only, as it gives me more details about the results (ie, the -sigma_e- and -sigma_u- statistics).
        Finally, -hausman- (if you have default standard errors) or the community-contributed program -xtoverid- can point you to the -fe- or -re- specification.
        Kind regards,
        Carlo
        (Stata 19.0)
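
        A rough sketch of the two specification checks mentioned above, again on the -nlswork- example dataset (an assumption, as are the regressors). -xtoverid- is community-contributed and has to be installed separately; the squared term is generated by hand because -xtoverid- may not accept factor-variable notation.

        ```stata
        * Illustrative sketch on an example dataset (assumption: not the poster's data)
        webuse nlswork, clear
        xtset idcode year
        generate age2 = age^2

        * -hausman- needs default (non-robust) standard errors in both models
        xtreg ln_wage age age2 tenure, fe
        estimates store fe
        xtreg ln_wage age age2 tenure, re
        estimates store re
        hausman fe re, sigmamore

        * Cluster-robust alternative (community-contributed; install it first)
        * ssc install xtoverid
        xtreg ln_wage age age2 tenure, re vce(cluster idcode)
        xtoverid
        ```

        Both tests can only compare coefficients that -fe- is able to estimate, so time-invariant regressors do not enter the comparison.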


        • #5
          Thanks Carlo, that's very helpful. One last question: as my main regressor is continuous but time-invariant, I understand it would not be possible to run the Hausman test of fe vs re?


          • #6
            Thomas:
            if your continuous regressor is time-invariant, there's no way to make -xtreg,fe- work for it: the within transformation wipes out anything that does not vary over time within a panel.
            Kind regards,
            Carlo
            (Stata 19.0)
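
            A small sketch of what happens in practice, assuming the -nlswork- example dataset. The variable -first_age- is a hypothetical stand-in for a time-invariant regressor such as childhood nutrition at age 5; it is constructed here only for illustration.

            ```stata
            * Illustrative sketch on an example dataset (assumption: not the poster's data)
            webuse nlswork, clear
            xtset idcode year

            * Hypothetical time-invariant regressor: age at the first observed wave
            bysort idcode (year): generate first_age = age[1]

            * -fe-: first_age does not vary within idcode, so it should be omitted
            xtreg ln_wage first_age tenure, fe vce(cluster idcode)

            * -re-: a coefficient on first_age is estimated
            xtreg ln_wage first_age tenure, re vce(cluster idcode)
            ```

            The -fe- output should flag -first_age- as omitted because of collinearity, whereas -re- (like pooled OLS) reports a coefficient for it.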


            • #7
              Thanks Carlo. Actually, one other question I had: with panel random effects, what is the difference in interpretation between not using robust and using robust (i.e. xtreg, re vs xtreg, re robust)? Without adding robust, does random effects not already correct for serial correlation?


              • #8
                Thomas:
                without robust or cluster (ie, cluster-robust) standard errors, -xtreg- does not take heteroskedasticity and/or autocorrelation of the idiosyncratic error into account.
                Kind regards,
                Carlo
                (Stata 19.0)
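
                For readers wondering what random effects does handle without -robust-, the textbook random-effects setup can be written out as below. This is a standard derivation rather than something stated in the thread; it assumes that the panel effect $u_i$ and the idiosyncratic error $\varepsilon_{it}$ are uncorrelated with each other and with the regressors, and that $\varepsilon_{it}$ is homoskedastic and serially uncorrelated. Here $\sigma_u$ and $\sigma_\varepsilon$ correspond to -sigma_u- and -sigma_e- in the -xtreg- output, and $T_i$ is the number of observations in panel $i$.

                ```latex
                \begin{align*}
                y_{it} &= x_{it}\beta + v_{it}, \qquad v_{it} = u_i + \varepsilon_{it} \\
                \operatorname{Corr}(v_{it}, v_{is}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2} = \rho \qquad (t \neq s) \\
                \theta_i &= 1 - \sqrt{\frac{\sigma_\varepsilon^2}{T_i\,\sigma_u^2 + \sigma_\varepsilon^2}} \\
                y_{it} - \theta_i \bar{y}_i &= (x_{it} - \theta_i \bar{x}_i)\beta + (v_{it} - \theta_i \bar{v}_i)
                \end{align*}
                ```

                Under these assumptions the random-effects estimator is OLS on the quasi-demeaned data in the last line. So the default standard errors allow only for the constant within-panel correlation $\rho$ induced by $u_i$; any further serial correlation or heteroskedasticity in $\varepsilon_{it}$ is what the cluster-robust option is for. With $\theta_i = 0$ (that is, $\sigma_u = 0$) the estimator reduces to pooled OLS, and as $\theta_i$ approaches 1 it approaches the within (fixed-effects) estimator.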


                • #9
                  Should results from pooled OLS with standard errors clustered at the panel level (reg, vce(cluster panelid)) differ from those of a random effects model with robust (xtreg, re robust)? I can post my results, which seem to show that this is the case.


                  • #10
                    Thomas:
                    this is expected. As you can see from the following toy example, the pooled OLS point estimates (with the panel dummies i.idcode included) coincide with those from -xtreg,fe- (but not with those from -xtreg,re-) as far as the shared coefficients are concerned:
                    Code:
                    . use "https://www.stata-press.com/data/r16/nlswork.dta"
                    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                    
                    . reg ln_wage i.idcode i.year c.age##c.age if idcode<=4, vce(cluster idcode)
                    
                    Linear regression                               Number of obs     =         50
                                                                    F(2, 3)           =          .
                                                                    Prob > F          =          .
                                                                    R-squared         =     0.6590
                                                                    Root MSE          =     .28764
                    
                                                     (Std. Err. adjusted for 4 clusters in idcode)
                    ------------------------------------------------------------------------------
                                 |               Robust
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                          idcode |
                              2  |   -.394374   .0296125   -13.32   0.001    -.4886143   -.3001337
                              3  |   .1010589   1.385597     0.07   0.946     -4.30853    4.510648
                              4  |   .5522662   1.411066     0.39   0.722    -3.938377    5.042909
                                 |
                            year |
                             69  |    .224994   .2362011     0.95   0.411    -.5267035    .9766914
                             70  |   .1649135   .4084011     0.40   0.713    -1.134801    1.464628
                             71  |   .1712431   .5615442     0.30   0.780    -1.615841    1.958327
                             72  |   .3136266   .8258606     0.38   0.729     -2.31463    2.941883
                             73  |   .4386164   1.160727     0.38   0.731    -3.255333    4.132566
                             75  |    .575746   1.642352     0.35   0.749     -4.65095    5.802442
                             77  |    .651904   2.091338     0.31   0.776    -6.003666    7.307474
                             78  |   .9314592   2.431102     0.38   0.727    -6.805391     8.66831
                             80  |   .9700889   3.035563     0.32   0.770    -8.690428    10.63061
                             82  |   1.063147   3.462289     0.31   0.779    -9.955403     12.0817
                             83  |   1.379563   3.652919     0.38   0.731    -10.24566    13.00478
                             85  |   1.854813   3.994786     0.46   0.674    -10.85838      14.568
                             87  |   2.153058   4.439545     0.48   0.661    -11.97556    16.28167
                             88  |   2.571633   4.744555     0.54   0.625    -12.52766    17.67092
                                 |
                             age |   .2364322   .2603253     0.91   0.431     -.592039    1.064903
                                 |
                     c.age#c.age |  -.0056102   .0021351    -2.63   0.078    -.0124052    .0011848
                                 |
                           _cons |  -1.093667   4.089099    -0.27   0.806    -14.10701    11.91967
                    ------------------------------------------------------------------------------
                    
                    
                    . xtreg ln_wage i.year c.age##c.age if idcode<=4, fe vce(cluster idcode)
                    
                    Fixed-effects (within) regression               Number of obs     =         50
                    Group variable: idcode                          Number of groups  =          4
                    
                    R-sq:                                           Obs per group:
                         within  = 0.5342                                         min =         11
                         between = 0.0151                                         avg =       12.5
                         overall = 0.2227                                         max =         15
                    
                                                                    F(3,3)            =          .
                    corr(u_i, Xb)  = -0.6249                        Prob > F          =          .
                    
                                                     (Std. Err. adjusted for 4 clusters in idcode)
                    ------------------------------------------------------------------------------
                                 |               Robust
                         ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                            year |
                             69  |    .224994   .2252089     1.00   0.391    -.4917214    .9417093
                             70  |   .1649135   .3893952     0.42   0.700    -1.074316    1.404143
                             71  |   .1712431   .5354114     0.32   0.770    -1.532675    1.875161
                             72  |   .3136266   .7874271     0.40   0.717    -2.192318    2.819571
                             73  |   .4386164   1.106709     0.40   0.718    -3.083427    3.960659
                             75  |    .575746   1.565921     0.37   0.738    -4.407713    5.559205
                             77  |    .651904   1.994012     0.33   0.765    -5.693933    6.997741
                             78  |   .9314592   2.317964     0.40   0.715    -6.445338    8.308257
                             80  |   .9700889   2.894296     0.34   0.760    -8.240852    10.18103
                             82  |   1.063147   3.301163     0.32   0.769    -9.442628    11.56892
                             83  |   1.379563   3.482922     0.40   0.719    -9.704648    12.46377
                             85  |   1.854813   3.808879     0.49   0.660    -10.26674    13.97637
                             87  |   2.153058    4.23294     0.51   0.646    -11.31805    15.62416
                             88  |   2.571633   4.523756     0.57   0.609    -11.82498    16.96824
                                 |
                             age |   .2364322   .2482104     0.95   0.411    -.5534841    1.026349
                                 |
                     c.age#c.age |  -.0056102   .0020358    -2.76   0.070     -.012089    .0008685
                                 |
                           _cons |  -1.036501   4.526557    -0.23   0.834    -15.44202    13.36902
                    -------------+----------------------------------------------------------------
                         sigma_u |  .38900632
                         sigma_e |  .28764391
                             rho |  .64651254   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    
                    .
                    
                    . xtreg ln_wage i.year c.age##c.age if idcode<=4, vce(cluster idcode)
                    
                    Random-effects GLS regression                   Number of obs     =         50
                    Group variable: idcode                          Number of groups  =          4
                    
                    R-sq:                                           Obs per group:
                         within  = 0.5077                                         min =         11
                         between = 0.0107                                         avg =       12.5
                         overall = 0.3789                                         max =         15
                    
                                                                    Wald chi2(3)      =          .
                    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .
                    
                                                     (Std. Err. adjusted for 4 clusters in idcode)
                    ------------------------------------------------------------------------------
                                 |               Robust
                         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                    -------------+----------------------------------------------------------------
                            year |
                             69  |   .1486743   .0549624     2.71   0.007     .0409499    .2563986
                             70  |    .277369   .2887149     0.96   0.337    -.2885019    .8432399
                             71  |   .1301287   .2621698     0.50   0.620    -.3837146     .643972
                             72  |   .1961925   .2197171     0.89   0.372     -.234445    .6268301
                             73  |   .2437975   .1144552     2.13   0.033     .0194694    .4681255
                             75  |   .2229619   .0470414     4.74   0.000     .1307625    .3151613
                             77  |   .0565625   .2272149     0.25   0.803    -.3887705    .5018956
                             78  |   .2544722   .4275434     0.60   0.552    -.5834974    1.092442
                             80  |   .2037513    .589187     0.35   0.729    -.9510341    1.358537
                             82  |   .0580069   .6546599     0.09   0.929    -1.225103    1.341117
                             83  |   .3523115   .6776646     0.52   0.603    -.9758868     1.68051
                             85  |   .6482934   .7762536     0.84   0.404    -.8731357    2.169722
                             87  |   .7630092   .8329341     0.92   0.360    -.8695117     2.39553
                             88  |     1.0182    .886308     1.15   0.251    -.7189321    2.755331
                                 |
                             age |   .2887857    .133158     2.17   0.030     .0278008    .5497707
                                 |
                     c.age#c.age |  -.0050776   .0021492    -2.36   0.018      -.00929   -.0008652
                                 |
                           _cons |  -2.402156   1.992439    -1.21   0.228    -6.307264    1.502953
                    -------------+----------------------------------------------------------------
                         sigma_u |          0
                         sigma_e |  .28764391
                             rho |          0   (fraction of variance due to u_i)
                    ------------------------------------------------------------------------------
                    Kind regards,
                    Carlo
                    (Stata 19.0)
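
                    One detail of the toy example worth checking: the -regress- call above includes the panel dummies (i.idcode), which is why it reproduces the -fe- coefficients. A sketch of the comparison without the dummies, on the same toy subsample, might look like this:

                    ```stata
                    * Same toy subsample as above, but pooled OLS without the i.idcode dummies
                    use "https://www.stata-press.com/data/r16/nlswork.dta", clear
                    regress ln_wage i.year c.age##c.age if idcode<=4, vce(cluster idcode)

                    * Random effects with cluster-robust standard errors, for comparison
                    xtreg ln_wage i.year c.age##c.age if idcode<=4, re vce(cluster idcode)
                    ```

                    Because -sigma_u- is estimated as 0 in the random-effects output above, the two sets of point estimates should be very close here. In a larger sample with a non-zero -sigma_u-, such as the 2000-household panel discussed in this thread, pooled OLS and random effects will generally give different coefficients.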
