Panel data: testing for serial correlation and heteroskedasticity

Carlo Lazzaro replied

16 Sep 2017, 08:12

Niels:
before switching to -xtgls- despite having a large N, small T dataset, please note the dramatically different times (in seconds) taken by -xtreg- and -xtgls- to run the same simple panel data regression:

Code:

. set rmsg on
r; t=0.00 15:48:21

. xtreg ln_wage i.race, re

Random-effects GLS regression                   Number of obs     =     28,534
Group variable: idcode                          Number of groups  =      4,711

R-sq:                                           Obs per group:
     within  = 0.0000                                         min =          1
     between = 0.0198                                         avg =        6.1
     overall = 0.0186                                         max =         15

                                                Wald chi2(2)      =      99.02
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      black  |  -.1300382    .013486    -9.64   0.000    -.1564702   -.1036062
      other  |   .1011474   .0562889     1.80   0.072    -.0091768    .2114716
             |
       _cons |   1.691756   .0071865   235.41   0.000     1.677671    1.705841
-------------+----------------------------------------------------------------
     sigma_u |  .38195681
     sigma_e |  .32028665
         rho |  .58714668   (fraction of variance due to u_i)
------------------------------------------------------------------------------
r; t=0.61 15:48:28

. xtgls ln_wage i.race

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        homoskedastic
Correlation:   no autocorrelation

Estimated covariances      =         1          Number of obs     =     28,534
Estimated autocorrelations =         0          Number of groups  =      4,711
Estimated coefficients     =         3          Obs per group:
                                                              min =          1
                                                              avg =   6.056888
                                                              max =         15
                                                Wald chi2(2)      =     542.80
Log likelihood             =    -19162          Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      black  |  -.1427862    .006243   -22.87   0.000    -.1550222   -.1305502
      other  |    .080671   .0274112     2.94   0.003      .026946     .134396
             |
       _cons |   1.714338   .0033339   514.21   0.000     1.707804    1.720873
------------------------------------------------------------------------------
r; t=692.49 16:00:07
.

A possible work-around could be:
- skip -xttest2- and -xttest3-;
- graphically inspect the distribution of your residuals;
- robustify/cluster your standard errors if you suspect that heteroskedasticity (especially) may affect your results (as said, serial correlation is expected to be a minor nuisance with a short T dimension).

Otherwise, as many econometricians do, go -cluster-/-robust- from the outset; with 200 panels (-panelid-) you have enough clusters to survive.
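
A minimal sketch of the work-around above, under a few assumptions: I use the -nlswork- example dataset (which I take to be the data behind the output shown earlier, given the variable names and sample sizes), and -e_hat- and -xb_hat- are just illustrative variable names. With your own data, replace -idcode- with your panel identifier.

```stata
* Assumption: the nlswork example dataset, loaded via -webuse-
webuse nlswork, clear
xtset idcode year

* Fit the random-effects model with default standard errors
xtreg ln_wage i.race, re

* Idiosyncratic residuals and linear prediction for a visual check
predict double e_hat, e
predict double xb_hat, xb

* Residual distribution against a normal density
histogram e_hat, normal

* Residuals vs. fitted values: a fan shape hints at heteroskedasticity
scatter e_hat xb_hat

* Same model with standard errors clustered on the panel identifier
xtreg ln_wage i.race, re vce(cluster idcode)
```

The point estimates from the two -xtreg- calls are identical; only the standard errors change, so comparing them gives a rough idea of how much heteroskedasticity and/or within-panel correlation matters for inference.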
