
  • How can I test for heteroskedasticity with panel data?

    Hi, I'm not very confident with Stata. I have already read some topics on this subject, but the procedure is still ambiguous to me. I have a panel of 8 variables observed in 36 countries every year from 1996 to 2016; the panel is unbalanced. I would like to check whether there is heteroskedasticity: what should I do? I have already checked for autocorrelation with the community-contributed command -xtserial- with the -output- option, but I am unsure about the result. I took a screenshot; I interpret it as showing no evidence of important autocorrelation, although some variables are autocorrelated.
    [Screenshots of the -xtserial- output attached: METà.png and altra metà.png]
    I don't know if I have been clear. If someone could help me, it would be awesome.
    PS: after dealing with heteroskedasticity, I would also like to test for functional-form misspecification. How can I do that?
    Thanks for your attention.

  • #2
    Enrico:
    you can investigate heteroskedasticity issues via the community-contributed command -xttest3-.
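    For instance, a minimal sketch of how -xttest3- is typically run after a fixed-effects estimation (the panel identifiers and variable names below are placeholders, not taken from your dataset):
    Code:
    * install the community-contributed command (once)
    ssc install xttest3
    
    * declare the panel structure (replace country and year with your own identifiers)
    xtset country year
    
    * fit a fixed-effects model, then run the modified Wald test
    * for groupwise heteroskedasticity across panels
    xtreg depvar indepvar1 indepvar2, fe
    xttest3
    A significant test statistic points towards groupwise heteroskedasticity.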
    As far as misspecification of the functional form of the dependent variable is concerned, you can consider the following toy example (which works regardless of the -fe- or -re- specification):
    Code:
    . use "https://www.stata-press.com/data/r16/nlswork.dta"
    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
    
    . xtreg ln_wage c.age##c.age i.race
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1087                                         min =          1
         between = 0.1175                                         avg =        6.1
         overall = 0.1048                                         max =         15
    
                                                    Wald chi2(4)      =    3498.50
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0594573   .0027157    21.89   0.000     .0541346      .06478
                 |
     c.age#c.age |  -.0006835    .000045   -15.18   0.000    -.0007717   -.0005952
                 |
            race |
          black  |  -.1237269   .0127651    -9.69   0.000    -.1487461   -.0987077
          other  |   .0965773   .0532529     1.81   0.070    -.0077965    .2009511
                 |
           _cons |   .5761164   .0398472    14.46   0.000     .4980173    .6542155
    -------------+----------------------------------------------------------------
         sigma_u |  .36094993
         sigma_e |  .30245467
             rho |   .5874941   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . predict fitted, xb
    (24 missing values generated)
    
    . g sq_fitted=fitted^2
    (24 missing values generated)
    
    . xtreg ln_wage c.age##c.age i.race fitted sq_fitted
    note: c.age#c.age omitted because of collinearity
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1101                                         min =          1
         between = 0.1157                                         avg =        6.1
         overall = 0.1043                                         max =         15
    
                                                    Wald chi2(5)      =    3533.86
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0061254   .0016687     3.67   0.000     .0028547     .009396
                 |
     c.age#c.age |          0  (omitted)
                 |
            race |
          black  |  -.0354914   .0163136    -2.18   0.030    -.0674655   -.0035173
          other  |   .0454009   .0542382     0.84   0.403    -.0609039    .1517058
                 |
          fitted |   3.089217   .3773461     8.19   0.000     2.349633    3.828802
       sq_fitted |  -.7322458   .1302198    -5.62   0.000    -.9874719   -.4770197
           _cons |  -1.603282   .2953293    -5.43   0.000    -2.182116   -1.024447
    -------------+----------------------------------------------------------------
         sigma_u |  .36086556
         sigma_e |  .30221986
             rho |   .5877572   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . test sq_fitted
    
     ( 1)  sq_fitted = 0
    
               chi2(  1) =   31.62
             Prob > chi2 =    0.0000
    
    .
    As the -test- outcome reaches statistical significance, the model is misspecified.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      That’s neat, Carlo. Thank you for sharing.


      • #4
        Thank you Carlo! Regarding the pictures above, is my interpretation right?
        If there is evidence of heteroskedasticity, should I use -xtreg, fe robust- or -xtgls-?


        • #5
          Enrico:
          the results of the test you provided (by the way: as per the FAQ, please use CODE delimiters to share what you typed and what Stata gave you back; whenever you use a community-contributed command, such as -xtserial-, please state that. Thanks) do not support evidence of AR(1) serial correlation, hence:
          1) if you do not detect heteroskedasticity, you can stick with default standard errors. However, as your T dimension does not seem that negligible, I would compare the model with and without default standard errors;
          2) if you detect heteroskedasticity, you should switch to robust or clustered standard errors (please note that, under -xtreg-, both options do the very same job; see the sketch after this list);
          3) in a hypothetical scenario where you detect both heteroskedasticity and autocorrelation, 2) still holds;
          4) -xtgls- (and -xtregar-) are for T>N panel datasets.
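          A minimal sketch of points 1) and 2), with placeholder variable names (assuming your data are -xtset- on country and year identifiers of your own):
          Code:
          * community-contributed test for first-order autocorrelation in panel data
          ssc install xtserial
          xtserial depvar indepvar1 indepvar2, output
          
          * fixed-effects model with default standard errors
          xtreg depvar indepvar1 indepvar2, fe
          
          * the same model with standard errors clustered on the panel identifier
          * (under -xtreg-, -vce(robust)- and -vce(cluster country)- do the very same job)
          xtreg depvar indepvar1 indepvar2, fe vce(cluster country)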
          Kind regards,
          Carlo
          (Stata 19.0)


          • #6
            OK, understood.
            Thanks, I think that covers all my questions. Thank you so much for your precious help!
            Best regards.


            • #7
              Hi, I have another question: what can I do to check for endogeneity of some independent variables? I have already looked at https://www.ifs.org.uk/docs/wooldrid...ession%204.pdf but I did not understand which case I fall into. Moreover, I tried to use -xtivreg2-, but the estimator is efficient, and its test statistics consistent, under homoskedasticity only, whereas I have heteroskedasticity.



              • #8
                Enrico:
                most of the time, endogeneity is detected on theoretical grounds.
                That said, a good first step would be to test whether your model is misspecified (if so, it might be due to endogeneity).
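                Should you want to stay with the community-contributed -xtivreg2-, heteroskedasticity-robust inference is available through its -robust- option; a minimal sketch with placeholder variable and instrument names (the instruments themselves have to be justified on theoretical grounds):
                Code:
                * community-contributed panel IV estimator
                ssc install xtivreg2
                
                * suspected_endog is the regressor you suspect of endogeneity;
                * z1 and z2 are hypothetical excluded instruments you would have to defend
                xtivreg2 depvar exog1 exog2 (suspected_endog = z1 z2), fe robust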
                Kind regards,
                Carlo
                (Stata 19.0)


                • #9
                  Thank you Carlo, I feel guilty for asking such trivial questions, so I hope I am not boring you; you have already helped me a lot! I want to test the effect of institutional quality on GDP growth (%). I suspect I am getting confused, because I thought that endogeneity and reverse causality were the same thing: is that correct? Is there no command that easily provides some result for reverse causality?
                  Best regards


                  • #10
                    Enrico:
                    reverse causality is a form of endogeneity (low income and depression, for instance: each can be both cause and effect of the other, other things being equal). The same might be true in your case: is it good institutions that, other things being equal, increase GDP, or the other way round?
                    I do not think you can rely on a hard-and-fast rule to detect endogeneity: I would rather skim through the literature in your research field and see whether warnings concerning endogenous regressors exist.
                    Kind regards,
                    Carlo
                    (Stata 19.0)


                    • #11
                      Got it, thank you very much for your help!
