xtreg fe versus areg robust errors?

Jay Mehalek

Join Date: Feb 2021
Posts: 7

16 Mar 2021, 20:50

Hello! I apologize if some of the questions in this post seem simple, but this has to do with my thesis and I appreciate any assistance!

I am working with balanced panel data on banks over 63 time periods (N=4736, T=63). I used the Hausman test to determine that I needed an FE model, but this is where my trouble and my questions begin. I ran a basic xtreg fe model and got this result:

Code:

xtreg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, fe

Fixed-effects (within) regression               Number of obs     =    298,355
Group variable: cert                            Number of groups  =      4,736

R-sq:                                           Obs per group:
     within  = 0.0179                                         min =         62
     between = 0.0242                                         avg =       63.0
     overall = 0.0227                                         max =         63

                                                F(8,293611)       =     670.17
corr(u_i, Xb)  = -0.0196                        Prob > F          =     0.0000

------------------------------------------------------------------------------
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .4193168   .0420687     9.97   0.000     .3368634    .5017702
   lnassetsq |  -.0071421   .0016294    -4.38   0.000    -.0103357   -.0039485
     diverse |   .0000122    .000058     0.21   0.834    -.0001015    .0001259
    leverage |   -.013206   .0004068   -32.46   0.000    -.0140033   -.0124087
      eeffqr |  -.0001518   8.33e-06   -18.23   0.000    -.0001681   -.0001355
       DGS10 |   .0478444   .0021882    21.86   0.000     .0435557    .0521332
CPIAUCSL_PCH |   .0676272   .0032808    20.61   0.000     .0611969    .0740575
   GDPC1_PC1 |   .0310941   .0008967    34.68   0.000     .0293366    .0328516
       _cons |  -1.795671   .2740217    -6.55   0.000    -2.332746   -1.258596
-------------+----------------------------------------------------------------
     sigma_u |  1.7745564
     sigma_e |  .99459682
         rho |  .76095759   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4735, 293611) = 196.07              Prob > F = 0.0000

So now I run some tests to check for serial correlation and heteroskedasticity.

Code:

xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (4736)  =   379.43
Prob>chi2 =      1.0000

If I interpret this correctly, this means my model does not suffer from heteroskedasticity.
Now, running xtserial, I get:

Code:

xtserial zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1

Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,    4735) =    226.237
           Prob > F =      0.0000

This means my model does suffer from autocorrelation. Given this information, what should I do now: run an FE model with vce(robust), or are there other tests I should run, or another model or option I should use? I am outside my statistical comfort zone at the moment, but I am trying to learn.

Secondly, if I were to use vce(robust), why do I get such different significance on some of my variables when I run it with areg rather than xtreg? It was my impression that the two commands were so similar that the results should not differ by much.
Results from areg (note that cert is just a unique identifier for each individual bank):

Code:

areg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, a(cert) vce(robust)

Linear regression, absorbing indicators         Number of obs     =    298,355
Absorbed variable: cert                         No. of categories =      4,736
                                                F(   8, 293611)   =     343.16
                                                Prob > F          =     0.0000
                                                R-squared         =     0.7691
                                                Adj R-squared     =     0.7654
                                                Root MSE          =     0.9946

------------------------------------------------------------------------------
             |               Robust
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .4193168   .0686047     6.11   0.000     .2848535    .5537801
   lnassetsq |  -.0071421   .0025061    -2.85   0.004    -.0120541   -.0022301
     diverse |   .0000122   .0000432     0.28   0.778    -.0000725    .0000969
    leverage |   -.013206   .0102436    -1.29   0.197    -.0332831    .0068711
      eeffqr |  -.0001518   .0001045    -1.45   0.146    -.0003566     .000053
       DGS10 |   .0478444   .0046113    10.38   0.000     .0388064    .0568824
CPIAUCSL_PCH |   .0676272   .0041492    16.30   0.000     .0594948    .0757596
   GDPC1_PC1 |   .0310941   .0011823    26.30   0.000     .0287769    .0334113
       _cons |  -1.795671   .4056216    -4.43   0.000    -2.590678   -1.000664
------------------------------------------------------------------------------

Now using an FE model with xtreg:

Code:

xtreg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, fe vce(robust)

Fixed-effects (within) regression               Number of obs     =    298,355
Group variable: cert                            Number of groups  =      4,736

R-sq:                                           Obs per group:
     within  = 0.0179                                         min =         62
     between = 0.0242                                         avg =       63.0
     overall = 0.0227                                         max =         63

                                                F(8,4735)         =     184.31
corr(u_i, Xb)  = -0.0196                        Prob > F          =     0.0000

                               (Std. Err. adjusted for 4,736 clusters in cert)
------------------------------------------------------------------------------
             |               Robust
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .4193168    .134761     3.11   0.002     .1551224    .6835111
   lnassetsq |  -.0071421   .0051472    -1.39   0.165    -.0172331    .0029488
     diverse |   .0000122   .0000445     0.27   0.784     -.000075    .0000994
    leverage |   -.013206   .0102906    -1.28   0.199    -.0333804    .0069684
      eeffqr |  -.0001518   .0001059    -1.43   0.152    -.0003594    .0000558
       DGS10 |   .0478444   .0067429     7.10   0.000     .0346253    .0610636
CPIAUCSL_PCH |   .0676272   .0037723    17.93   0.000     .0602317    .0750227
   GDPC1_PC1 |   .0310941   .0014933    20.82   0.000     .0281667    .0340216
       _cons |  -1.795671   .8540467    -2.10   0.036        -3.47   -.1213422
-------------+----------------------------------------------------------------
     sigma_u |  1.7745564
     sigma_e |  .99459682
         rho |  .76095759   (fraction of variance due to u_i)
------------------------------------------------------------------------------

I do not understand exactly why lnassetsq in particular became so insignificant in the FE model with robust errors but not in areg. I am sorry if I am missing something elementary here.

Last edited by Jay Mehalek; 16 Mar 2021, 21:08.

Tags: areg vs xtreg, fixed effects, panel, panel data, regression

Andrew Musau

Join Date: Oct 2014

Posts: 10213
#2

17 Mar 2021, 00:05

With areg, the -robust- option (vce(robust)) produces Huber-White (heteroskedasticity-robust) standard errors, whereas with xtreg, fe the same option produces cluster-robust standard errors with the panel identifier as the clustering variable. Refer to the manual entry for xtreg for literature discussing why the former are not appropriate for panel data. To make the two commands comparable, you need to specify your areg command as:

Code:

areg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, a(cert) cluster(cert)
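
The point about the two variance estimators can be checked side by side. The following is a minimal sketch, assuming the -nlswork- example dataset shipped with Stata stands in for the bank panel (the dataset, the regressors, and the stored-estimate names are illustrative choices, not part of the original question). The coefficients should be identical in all four fits; only the standard errors change, and the two clustered versions may still differ slightly because the commands apply different finite-sample degrees-of-freedom adjustments.

```stata
* Illustrative data only: substitute your own panel and variables
webuse nlswork, clear
xtset idcode year

* xtreg, fe: vce(robust) is treated as clustering on the panel variable
xtreg ln_wage c.age##c.age, fe vce(robust)
estimates store xt_robust

* xtreg, fe with the clustering variable spelled out
xtreg ln_wage c.age##c.age, fe vce(cluster idcode)
estimates store xt_cluster

* areg: vce(robust) is Huber-White only, with no clustering
areg ln_wage c.age##c.age, absorb(idcode) vce(robust)
estimates store areg_robust

* areg with cluster-robust standard errors
areg ln_wage c.age##c.age, absorb(idcode) vce(cluster idcode)
estimates store areg_cluster

* Coefficients and standard errors for all four fits in one table
estimates table xt_robust xt_cluster areg_robust areg_cluster, b(%9.5f) se(%9.5f)
```

In the resulting table, the first two columns should match each other, while the areg_robust column is the one expected to stand apart.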
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#3

17 Mar 2021, 01:49

Jay:
as an aside to Andrew's helpful advice, you should consider including a time effect on the right-hand side of your regression equation (as a categorical or a continuous variable, the latter if you're going to investigate possible turning points).
Besides, I would also check whether your model suffers from misspecification.
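
The two ways of entering time can be sketched as follows. This is a rough illustration on the -nlswork- example dataset (an assumption; the regressor -tenure- and the quadratic form are illustrative choices, not taken from the original model). The categorical version estimates one coefficient per period, while the continuous version with a squared term allows a single turning point to be computed.

```stata
* Illustrative data only: substitute your own panel and variables
webuse nlswork, clear
xtset idcode year

* Time as a categorical variable: one dummy per year
xtreg ln_wage tenure i.year, fe vce(cluster idcode)

* Time as a continuous variable with a squared term
xtreg ln_wage tenure c.year##c.year, fe vce(cluster idcode)

* Turning point of the quadratic in year: -b1/(2*b2)
nlcom -_b[year] / (2 * _b[c.year#c.year])
```

The turning point from -nlcom- is only meaningful if the squared term is reasonably well estimated and the implied value falls inside the observed range of the time variable.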

Kind regards,
Carlo
(Stata 19.0)

Jay Mehalek

Join Date: Feb 2021
Posts: 7

17 Mar 2021, 11:51

Hello Carlo,

You told me I should consider testing the model specification and adding a time variable to the right-hand side of my equation. So I have two follow-up questions. First, how would I approach testing for model specification? Resetxt gives me a "too many values" error. Second, when I include i.time on the right-hand side of my equation, I get the significance back with my expected coefficients, but I get this:
Code:

. xtreg zscore lnasset lnassetsq diverse leverage eeffqr DGS10C CPIAUCSL_PCHC 
> gdpgrowth i.time, fe vce(robust)
note: 61.time omitted because of collinearity
note: 62.time omitted because of collinearity
note: 63.time omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =    298,355
Group variable: cert                            Number of groups  =      4,736

R-sq:                                           Obs per group:
     within  = 0.0796                                         min =         62
     between = 0.0333                                         avg =       63.0
     overall = 0.0429                                         max =         63

                                                F(67,4735)        =     144.60
corr(u_i, Xb)  = 0.0342                         Prob > F          =     0.0000

                               (Std. Err. adjusted for 4,736 clusters in cert)
------------------------------------------------------------------------------
             |               Robust
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .5973538   .1325858     4.51   0.000     .3374239    .8572837
   lnassetsq |  -.0165323   .0049245    -3.36   0.001    -.0261866   -.0068781
     diverse |   7.57e-06   .0000394     0.19   0.848    -.0000697    .0000848
    leverage |  -.0116178   .0089223    -1.30   0.193    -.0291095     .005874
      eeffqr |  -.0001417   .0000997    -1.42   0.155    -.0003372    .0000538
      DGS10C |   .1030249   .0112794     9.13   0.000      .080912    .1251378
CPIAUCSL_P~C |   -.076835   .0488219    -1.57   0.116    -.1725488    .0188787
   gdpgrowth |   .0060696    .001485     4.09   0.000     .0031583    .0089809
             |
        time |
          2  |   .1663595   .0401901     4.14   0.000     .0875683    .2451508
          3  |   .0968155   .0175929     5.50   0.000     .0623253    .1313057
          4  |     -.2696   .0260467   -10.35   0.000    -.3206637   -.2185363
          5  |  -.1124687    .019199    -5.86   0.000    -.1501078   -.0748297
          6  |   .0143425   .0212968     0.67   0.501    -.0274091    .0560941
          7  |  -.0943285   .0622068    -1.52   0.129    -.2162828    .0276258
          8  |   -.286841   .0253616   -11.31   0.000    -.3365614   -.2371205
          9  |  -.2291939   .0262038    -8.75   0.000    -.2805655   -.1778223
         10  |  -.1239982   .0223498    -5.55   0.000    -.1678143   -.0801821
         11  |   -.005758   .0310194    -0.19   0.853    -.0665706    .0550545
         12  |  -.2973506      .0344    -8.64   0.000    -.3647906   -.2299105
         13  |  -.1336273   .0355004    -3.76   0.000    -.2032246     -.06403
         14  |  -.1546869    .052413    -2.95   0.003    -.2574407    -.051933
         15  |  -.6071432   .1259992    -4.82   0.000    -.8541602   -.3601261
         16  |   -.945345   .0545972   -17.31   0.000    -1.052381   -.8383091
         17  |  -.4607318    .018483   -24.93   0.000    -.4969672   -.4244965
         18  |  -.7826127    .025444   -30.76   0.000    -.8324947   -.7327307
         19  |  -.6448614   .0227605   -28.33   0.000    -.6894825   -.6002402
         20  |   -1.18486   .0352272   -33.63   0.000    -1.253922   -1.115798
         21  |  -.5955066     .03892   -15.30   0.000    -.6718079   -.5192054
         22  |  -.4392185   .0224526   -19.56   0.000     -.483236    -.395201
         23  |  -.3158391   .0232259   -13.60   0.000    -.3613727   -.2703055
         24  |  -.7708343   .0343497   -22.44   0.000    -.8381757   -.7034928
         25  |  -.4544725   .0305728   -14.87   0.000    -.5144094   -.3945355
         26  |  -.2639938    .020654   -12.78   0.000    -.3044852   -.2235024
         27  |  -.1026381   .0183024    -5.61   0.000    -.1385193    -.066757
         28  |  -.4982814   .0208304   -23.92   0.000    -.5391188   -.4574441
         29  |  -.1532437   .0190782    -8.03   0.000    -.1906459   -.1158415
         30  |  -.0993349   .0180403    -5.51   0.000    -.1347022   -.0639676
         31  |  -.0107705   .0218292    -0.49   0.622    -.0535659     .032025
         32  |  -.3782891   .0197427   -19.16   0.000     -.416994   -.3395843
         33  |  -.3275966   .0290952   -11.26   0.000    -.3846367   -.2705565
         34  |  -.2379256   .0150507   -15.81   0.000    -.2674319   -.2084193
         35  |  -.2807777   .0184106   -15.25   0.000    -.3168711   -.2446844
         36  |  -.4549801   .0190179   -23.92   0.000    -.4922641   -.4176962
         37  |  -.4021503   .0149412   -26.92   0.000    -.4314421   -.3728585
         38  |  -.2627981   .0224138   -11.72   0.000    -.3067396   -.2188565
         39  |   -.193605   .0391885    -4.94   0.000    -.2704326   -.1167774
         40  |  -.4249193    .058925    -7.21   0.000    -.5404396    -.309399
         41  |  -.2400543   .0158437   -15.15   0.000    -.2711152   -.2089933
         42  |   -.197201   .0144061   -13.69   0.000    -.2254436   -.1689584
         43  |  -.1579058   .0259732    -6.08   0.000    -.2088253   -.1069863
         44  |  -.3872782    .030056   -12.89   0.000     -.446202   -.3283544
         45  |  -.1674726   .0235227    -7.12   0.000     -.213588   -.1213573
         46  |  -.1035471   .0134319    -7.71   0.000    -.1298798   -.0772144
         47  |  -.1140232   .0151425    -7.53   0.000    -.1437096   -.0843367
         48  |  -.3888785   .0175693   -22.13   0.000    -.4233225   -.3544345
         49  |  -.3110916   .0206298   -15.08   0.000    -.3515357   -.2706475
         50  |  -.1652558   .0146101   -11.31   0.000    -.1938985   -.1366132
         51  |  -.0570616   .0177773    -3.21   0.001    -.0919133   -.0222098
         52  |  -.7346338   .0245985   -29.86   0.000    -.7828583   -.6864093
         53  |  -.1144805   .0171728    -6.67   0.000    -.1481471   -.0808139
         54  |   .0227917   .0206306     1.10   0.269    -.0176539    .0632373
         55  |   .1122881   .0213554     5.26   0.000     .0704216    .1541546
         56  |  -.1233688      .0286    -4.31   0.000     -.179438   -.0672996
         57  |   .0472902   .0238965     1.98   0.048     .0004419    .0941385
         58  |   .1541965    .016675     9.25   0.000     .1215058    .1868871
         59  |   .2720043   .0178643    15.23   0.000     .2369819    .3070266
         60  |   .0045996   .0167165     0.28   0.783    -.0281725    .0373717
         61  |          0  (omitted)
         62  |          0  (omitted)
         63  |          0  (omitted)
             |
       _cons |  -2.365842   .8712137    -2.72   0.007    -4.073826   -.6578575
-------------+----------------------------------------------------------------
     sigma_u |  1.7684947
     sigma_e |  .96293713
         rho |  .77132213   (fraction of variance due to u_i)
------------------------------------------------------------------------------

The problem is that the last 3 periods (61, 62, and 63) are important in my research, so having them omitted is not acceptable. Is there any way you would recommend I approach this? Also, is it possible to leave out the robust errors? As previously mentioned, I believe the model does not suffer from heteroskedasticity, but it does show some autocorrelation. Thanks for any help at all.

Last edited by Jay Mehalek; 17 Mar 2021, 11:54. Reason: Code did not enter correctly


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

17 Mar 2021, 12:06

Jay:
you can decide the reference year yourself via -fvvarlist- options and test the joint significance of -i.year- via -testparm-:

Code:

. xtreg ln_wage c.age##c.age b71.year, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1162                                         min =          1
     between = 0.1078                                         avg =        6.1
     overall = 0.0932                                         max =         15

                                                F(16,4709)        =      79.11
corr(u_i, Xb)  = 0.0613                         Prob > F          =     0.0000

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0728746    .013687     5.32   0.000     .0460416    .0997075
             |
 c.age#c.age |  -.0010113   .0001076    -9.40   0.000    -.0012224   -.0008003
             |
        year |
         68  |  -.0579959   .0384111    -1.51   0.131    -.1332996    .0173078
         69  |   .0067095   .0271128     0.25   0.805    -.0464444    .0598633
         70  |  -.0295537   .0151488    -1.95   0.051    -.0592524    .0001451
         72  |  -.0069288   .0151116    -0.46   0.647    -.0365547    .0226971
         73  |  -.0155855   .0263223    -0.59   0.554    -.0671895    .0360184
         75  |  -.0428583   .0496147    -0.86   0.388    -.1401263    .0544097
         77  |  -.0239027   .0739429    -0.32   0.747    -.1688653    .1210599
         78  |  -.0042625   .0864863    -0.05   0.961    -.1738162    .1652911
         80  |  -.0210484   .1106238    -0.19   0.849    -.2379228     .195826
         82  |  -.0188272   .1348265    -0.14   0.889    -.2831501    .2454958
         83  |   .0007701   .1469068     0.01   0.996    -.2872359    .2887762
         85  |   .0462799   .1713609     0.27   0.787    -.2896676    .3822274
         87  |   .0662312   .1961237     0.34   0.736    -.3182629    .4507254
         88  |   .1325018   .2120509     0.62   0.532    -.2832172    .5482208
             |
       _cons |   .4517491   .2826119     1.60   0.110    -.1023023    1.005801
-------------+----------------------------------------------------------------
     sigma_u |  .40275174
     sigma_e |  .30127563
         rho |  .64120306   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. testparm(i.year)

 ( 1)  68.year = 0
 ( 2)  69.year = 0
 ( 3)  70.year = 0
 ( 4)  72.year = 0
 ( 5)  73.year = 0
 ( 6)  75.year = 0
 ( 7)  77.year = 0
 ( 8)  78.year = 0
 ( 9)  80.year = 0
 (10)  82.year = 0
 (11)  83.year = 0
 (12)  85.year = 0
 (13)  87.year = 0
 (14)  88.year = 0

       F( 14,  4709) =   10.34
            Prob > F =    0.0000
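
Besides the b71. prefix used above, the base level can be set in other ways. The sketch below, again assuming the -nlswork- example dataset, shows two of them: the ib(last). operator, which takes the last level as the reference, and -fvset base-, which stores the base level with the variable so that a plain i.year picks it up. Changing the base level only changes which period the others are compared with; it does not by itself resolve collinearity between time dummies and other regressors.

```stata
* Illustrative data only: substitute your own panel and variables
webuse nlswork, clear
xtset idcode year

* Base level chosen inline: the last observed year is the reference
xtreg ln_wage c.age##c.age ib(last).year, fe vce(cluster idcode)
testparm i.year

* Base level stored with the variable, then used by a plain i.year
fvset base 71 year
xtreg ln_wage c.age##c.age i.year, fe vce(cluster idcode)
testparm i.year
```

The joint test from -testparm- should not depend on which year serves as the reference.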

You can check your model specification via an augmented regression that follows the approach described in the -linktest- entry of the Stata .pdf manual.
The following toy example is clearly misspecified, as the -test- outcome on -sq_fitted- reaches statistical significance:

Code:

. predict fitted, xb
(24 missing values generated)

. g sq_fitted=fitted^2
(24 missing values generated)

. xtreg ln_wage c.age##c.age b71.year fitted sq_fitted , fe vce(cluster idcode)
note: c.age#c.age omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1173                                         min =          1
     between = 0.1121                                         avg =        6.1
     overall = 0.0952                                         max =         15

                                                F(17,4709)        =      76.35
corr(u_i, Xb)  = 0.0636                         Prob > F          =     0.0000

                             (Std. Err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0004375   .0123334    -0.04   0.972    -.0246168    .0237418
             |
 c.age#c.age |          0  (omitted)
             |
        year |
         68  |   .0164466   .0396213     0.42   0.678    -.0612297    .0941229
         69  |  -.0000874    .027039    -0.00   0.997    -.0530965    .0529217
         70  |   .0037178   .0157113     0.24   0.813    -.0270836    .0345192
         72  |   .0007766   .0150744     0.05   0.959    -.0287762    .0303294
         73  |    .000099   .0262498     0.00   0.997    -.0513628    .0515608
         75  |   -.000556   .0494942    -0.01   0.991    -.0975878    .0964758
         77  |   .0053053   .0737887     0.07   0.943    -.1393551    .1499657
         78  |   .0134469   .0864027     0.16   0.876    -.1559428    .1828367
         80  |   .0157148   .1104865     0.14   0.887    -.2008905    .2323201
         82  |   .0222533   .1347352     0.17   0.869    -.2418906    .2863973
         83  |    .032282   .1468793     0.22   0.826    -.2556702    .3202341
         85  |   .0578667   .1716478     0.34   0.736    -.2786434    .3943767
         87  |   .0688459   .1964455     0.35   0.726    -.3162791    .4539709
         88  |   .1102907   .2132449     0.52   0.605    -.3077691    .5283505
             |
      fitted |   5.201776   1.085644     4.79   0.000     3.073405    7.330147
   sq_fitted |  -1.321262   .3415637    -3.87   0.000    -1.990887   -.6516372
       _cons |  -3.323554   .9079942    -3.66   0.000    -5.103648   -1.543461
-------------+----------------------------------------------------------------
     sigma_u |  .40189262
     sigma_e |   .3011033
         rho |  .64048345   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test sq_fitted

 ( 1)  sq_fitted = 0

       F(  1,  4709) =   14.96
            Prob > F =    0.0001

.

Kind regards,
Carlo
(Stata 19.0)
