
  • How can I test for heteroskedasticity with panel data?

    Hi, I'm not very confident with Stata. I have already read some topics on this subject, but the procedure is still ambiguous to me. I have a panel of 8 variables observed in 36 countries every year from 1996 to 2016; the panel is unbalanced. I would like to check whether there is heteroskedasticity: what should I do? I have already checked for autocorrelation with the community-contributed command -xtserial- with the -output- option, but I am unsure about the result. I took a screenshot; I interpret it as showing no evidence of important autocorrelation, although some variables are autocorrelated.
    [Screenshots of the -xtserial- output attached: METà.png and altra metà.png]
    I don't know if I have been clear. If someone could help me, it would be awesome.
    PS: after dealing with heteroskedasticity, I would also like to test for functional-form misspecification. How can I do that?
    Thanks for your attention.

  • #2
    Enrico:
    you can investigate heteroskedasticity issues via the community-contributed command -xttest3-.
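    For instance, a minimal sketch of how -xttest3- is typically run after a fixed-effects estimation (the panel identifiers and variable names below are placeholders, not taken from your dataset):
    Code:
    * install the community-contributed command (once)
    ssc install xttest3
    
    * declare the panel structure (replace country and year with your own identifiers)
    xtset country year
    
    * fit a fixed-effects model, then run the modified Wald test
    * for groupwise heteroskedasticity across panels
    xtreg depvar indepvar1 indepvar2, fe
    xttest3
    A significant test statistic points towards groupwise heteroskedasticity.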
    As far as misspecification of the functional form of the dependent variable is concerned, you can consider the following toy example (which works regardless of the -fe- or -re- specification):
    Code:
    . use "https://www.stata-press.com/data/r16/nlswork.dta"
    (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
    
    . xtreg ln_wage c.age##c.age i.race
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1087                                         min =          1
         between = 0.1175                                         avg =        6.1
         overall = 0.1048                                         max =         15
    
                                                    Wald chi2(4)      =    3498.50
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0594573   .0027157    21.89   0.000     .0541346      .06478
                 |
     c.age#c.age |  -.0006835    .000045   -15.18   0.000    -.0007717   -.0005952
                 |
            race |
          black  |  -.1237269   .0127651    -9.69   0.000    -.1487461   -.0987077
          other  |   .0965773   .0532529     1.81   0.070    -.0077965    .2009511
                 |
           _cons |   .5761164   .0398472    14.46   0.000     .4980173    .6542155
    -------------+----------------------------------------------------------------
         sigma_u |  .36094993
         sigma_e |  .30245467
             rho |   .5874941   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . predict fitted, xb
    (24 missing values generated)
    
    . g sq_fitted=fitted^2
    (24 missing values generated)
    
    . xtreg ln_wage c.age##c.age i.race fitted sq_fitted
    note: c.age#c.age omitted because of collinearity
    
    Random-effects GLS regression                   Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-sq:                                           Obs per group:
         within  = 0.1101                                         min =          1
         between = 0.1157                                         avg =        6.1
         overall = 0.1043                                         max =         15
    
                                                    Wald chi2(5)      =    3533.86
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
         ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0061254   .0016687     3.67   0.000     .0028547     .009396
                 |
     c.age#c.age |          0  (omitted)
                 |
            race |
          black  |  -.0354914   .0163136    -2.18   0.030    -.0674655   -.0035173
          other  |   .0454009   .0542382     0.84   0.403    -.0609039    .1517058
                 |
          fitted |   3.089217   .3773461     8.19   0.000     2.349633    3.828802
       sq_fitted |  -.7322458   .1302198    -5.62   0.000    -.9874719   -.4770197
           _cons |  -1.603282   .2953293    -5.43   0.000    -2.182116   -1.024447
    -------------+----------------------------------------------------------------
         sigma_u |  .36086556
         sigma_e |  .30221986
             rho |   .5877572   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    . test sq_fitted
    
     ( 1)  sq_fitted = 0
    
               chi2(  1) =   31.62
             Prob > chi2 =    0.0000
    
    .
    As the -test- outcome reaches statistical significance, the model is misspecified.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      That’s neat, Carlo. Thank you for sharing.


      • #4
        Thank you Carlo! Regarding the pictures above, is my interpretation right?
        If there is evidence of heteroskedasticity, should I use -xtreg, fe robust- or -xtgls-?


        • #5
          Enrico:
          the results of the test you provided (by the way: as per the FAQ, please use CODE delimiters to share what you typed and what Stata gave you back; whenever you use a community-contributed command, such as -xtserial-, please state that. Thanks) do not support evidence of AR(1) serial correlation, hence:
          1) if you do not detect heteroskedasticity, you can stick with default standard errors. However, as your T dimension does not seem that negligible, I would compare the model with and without default standard errors;
          2) if you detect heteroskedasticity, you should switch to robust or clustered standard errors (please note that, under -xtreg-, both options do the very same job; see the sketch after this list);
          3) in a hypothetical scenario where you detect both heteroskedasticity and autocorrelation, 2) still holds;
          4) -xtgls- (and -xtregar-) are for T>N panel datasets.
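          A minimal sketch of points 1) and 2), with placeholder variable names (assuming your data are -xtset- on country and year identifiers of your own):
          Code:
          * community-contributed test for first-order autocorrelation in panel data
          ssc install xtserial
          xtserial depvar indepvar1 indepvar2, output
          
          * fixed-effects model with default standard errors
          xtreg depvar indepvar1 indepvar2, fe
          
          * the same model with standard errors clustered on the panel identifier
          * (under -xtreg-, -vce(robust)- and -vce(cluster country)- do the very same job)
          xtreg depvar indepvar1 indepvar2, fe vce(cluster country)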
          Kind regards,
          Carlo
          (Stata 19.0)


          • #6
            OK, understood.
            Thanks, I think that covers all my questions. Thank you so much for your precious help!
            Best regards.


            • #7
              Hi, I have another question: what can I do to check for endogeneity of some independent variables? I have already looked at https://www.ifs.org.uk/docs/wooldrid...ession%204.pdf but I did not understand which case I fall into. Moreover, I tried to use -xtivreg2-, but the estimator is efficient, and its test statistics consistent, under homoskedasticity only, whereas I have heteroskedasticity.



              • #8
                Enrico:
                most of the time, endogeneity is detected on theoretical grounds.
                That said, a good first step would be to test whether your model is misspecified (if so, it might be due to endogeneity).
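                Should you want to stay with the community-contributed -xtivreg2-, heteroskedasticity-robust inference is available through its -robust- option; a minimal sketch with placeholder variable and instrument names (the instruments themselves have to be justified on theoretical grounds):
                Code:
                * community-contributed panel IV estimator
                ssc install xtivreg2
                
                * suspected_endog is the regressor you suspect of endogeneity;
                * z1 and z2 are hypothetical excluded instruments you would have to defend
                xtivreg2 depvar exog1 exog2 (suspected_endog = z1 z2), fe robust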
                Kind regards,
                Carlo
                (Stata 19.0)


                • #9
                  Thank you Carlo, I feel guilty for asking such trivial questions, so I hope I am not boring you; you have already helped me a lot! I want to test the effect of institutional quality on GDP growth (%). I suspect I am getting confused, because I thought that endogeneity and reverse causality were the same thing: is that correct? Is there no command that easily provides some result for reverse causality?
                  Best regards


                  • #10
                    Enrico:
                    reverse causality is a form of endogeneity (low income and depression, for instance: each can be both cause and effect of the other, other things being equal). The same might be true in your case: is it good institutions that, other things being equal, increase GDP, or the other way round?
                    I do not think you can rely on a hard-and-fast rule to detect endogeneity: I would rather skim through the literature in your research field and see whether warnings concerning endogenous regressors exist.
                    Kind regards,
                    Carlo
                    (Stata 19.0)


                    • #11
                      Got it, thank you very much for your help!
