  • Panel data: testing for serial correlation and heteroskedasticity

    Hello,

    I've got a panel data set with 200 banks, with data from 2002-2016 and varying degrees of data availability. On average there are about 8.5 years of data available per bank. The dependent variable is default risk, and the explanatory variables are board characteristics for each bank, plus various control variables.

    Now I was wondering how I should go about testing for serial correlation and heteroskedasticity. I've already read this https://www.stata.com/support/faqs/s...tocorrelation/ but that is not entirely what I am looking for, as it is unclear to me what exactly is happening. I would like to manually run tests for serial correlation and heteroskedasticity.

    I've done a Breusch-Godfrey test for serial correlation before, but not on a panel dataset, just on time series. Is this also suitable for panel data? And how would I perform this test for panel data?

    Similarly, I've done a Breusch-Pagan test for heteroskedasticity before, but never on panel data. Is this suitable for panel data?

    Some help would be greatly appreciated, as I am new to panel data analysis.

    Kind regards,

    Niels

  • #2
    Niels:
    you seem to have a large N, small T panel dataset: hence, assuming a continuous dependent variable (that is, a score for default risk), I would go -xtreg-.
    You can graphically inspect your residual distribution and see whether a heteroskedasticity-suggestive pattern comes alive.
    If that were the case, you can robustify/cluster your standard errors (these options do the same job under -xtreg-) and account for heteroskedasticity and/or autocorrelation (the latter is usually a minor nuisance in panels like the one you're supposed to deal with).
    If my assumptions about your panel were incorrect (say, your dependent variable is categorical: default risk yes/no), please provide the list with further details.
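    As a rough illustration of the steps above, here is a minimal sketch. The variable names (bank_id, year, default_risk, board_size, board_indep) are hypothetical placeholders, not taken from the actual dataset:
    Code:
    * declare the panel structure (hypothetical identifiers)
    xtset bank_id year
    * random-effects fit with default standard errors
    xtreg default_risk board_size board_indep, re
    * combined residual (u_i + e_it) and linear prediction
    predict double res_ue, ue
    predict double xb_hat, xb
    * a funnel-shaped cloud would suggest heteroskedasticity
    scatter res_ue xb_hat, yline(0)
    * same model with standard errors clustered on the panel identifier
    xtreg default_risk board_size board_indep, re vce(cluster bank_id)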
    Kind regards,
    Carlo
    (Stata 18.0 SE)


    • #3
      Carlo, thank you for taking the time to reply; I appreciate it.

      Your assumptions are indeed correct, default risk is a score. I was wondering if a Breusch-Godfrey and Breusch-Pagan test would be suitable? Or alternatives? As this is in regards to my thesis, I prefer using objective tests as opposed to analyzing patterns. This would also allow me to incorporate the results of these tests into my thesis.


      • #4
        Niels:
        see the user-written programmes -xttest2- and -xttest3-.
        Kind regards,
        Carlo
        (Stata 18.0 SE)


        • #5
          Carlo, once again thank you for the quick response.

          xttest2: 'XTTEST2': module to perform Breusch-Pagan LM test for cross-sectional correlation in fixed effects model

          I forgot to mention that I am planning on using a random effects estimator, so it appears that this method is not suitable for me I suppose? Same goes for xttest3.


          • #6
            Niels:
            you have to replace -xtreg- with -xtgls-, then.
            Kind regards,
            Carlo
            (Stata 18.0 SE)


            • #7
              Niels:
              before switching to -xtgls- despite having a large N, small T dataset, please note the dramatically different times (in seconds) taken by -xtreg- and -xtgls- to perform the same simple panel data regression:
              Code:
              . set rmsg on
              r; t=0.00 15:48:21
              
              . xtreg ln_wage i.race, re
              
              Random-effects GLS regression                   Number of obs     =     28,534
              Group variable: idcode                          Number of groups  =      4,711
              
              R-sq:                                           Obs per group:
                   within  = 0.0000                                         min =          1
                   between = 0.0198                                         avg =        6.1
                   overall = 0.0186                                         max =         15
              
                                                              Wald chi2(2)      =      99.02
              corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
              
              ------------------------------------------------------------------------------
                   ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                      race |
                    black  |  -.1300382    .013486    -9.64   0.000    -.1564702   -.1036062
                    other  |   .1011474   .0562889     1.80   0.072    -.0091768    .2114716
                           |
                     _cons |   1.691756   .0071865   235.41   0.000     1.677671    1.705841
              -------------+----------------------------------------------------------------
                   sigma_u |  .38195681
                   sigma_e |  .32028665
                       rho |  .58714668   (fraction of variance due to u_i)
              ------------------------------------------------------------------------------
              r; t=0.61 15:48:28
              
              . xtgls ln_wage i.race
              
              Cross-sectional time-series FGLS regression
              
              Coefficients:  generalized least squares
              Panels:        homoskedastic
              Correlation:   no autocorrelation
              
              Estimated covariances      =         1          Number of obs     =     28,534
              Estimated autocorrelations =         0          Number of groups  =      4,711
              Estimated coefficients     =         3          Obs per group:
                                                                            min =          1
                                                                            avg =   6.056888
                                                                            max =         15
                                                              Wald chi2(2)      =     542.80
              Log likelihood             =    -19162          Prob > chi2       =     0.0000
              
              ------------------------------------------------------------------------------
                   ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------+----------------------------------------------------------------
                      race |
                    black  |  -.1427862    .006243   -22.87   0.000    -.1550222   -.1305502
                    other  |    .080671   .0274112     2.94   0.003      .026946     .134396
                           |
                     _cons |   1.714338   .0033339   514.21   0.000     1.707804    1.720873
              ------------------------------------------------------------------------------
              r; t=692.49 16:00:07
              .
              A possible work-around could be:
              - skip -xttest2- and -xttest3-;
              - graphically inspect your residual distribution;
              - robustify/cluster your standard errors if you suspect that (especially) heteroskedasticity can bite your results (as said, serial correlation is expected to be a minor nuisance with a short T dimension).

              Otherwise, as many econometricians usually do, go -cluster-/-robust- from scratch; with 200 -panelid- you have enough clusters to survive.
              Kind regards,
              Carlo
              (Stata 18.0 SE)


              • #8
                In finance journals, you can find that the "common" way to deal with serial correlation and heteroskedasticity is to (directly) use "clustered standard errors".
                Ho-Chuan (River) Huang
                Stata 17.0, MP(4)


                • #9
                  Originally posted by River Huang:
                  In finance journals, you can find that the "common" way to deal with serial correlation and heteroskedasticity is to (directly) use "clustered standard errors".
                  Do you mean using "robust" option for xtreg?


                  • #10
                    Erol:
                    please note that under -xtreg- -robust- and -cluster- options do the same job.
                    That feature does not apply to -regress-, where -robust- and -cluster- options are totally different beasts.
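                    A quick way to check this yourself, sketched on the nlswork example data (my assumption: it matches the ln_wage/idcode/race variables in the -xtreg- output earlier in this thread):
                    Code:
                    webuse nlswork, clear
                    xtset idcode year
                    * under -xtreg-, the two options should give identical standard errors
                    xtreg ln_wage i.race, re vce(robust)
                    estimates store rob
                    xtreg ln_wage i.race, re vce(cluster idcode)
                    estimates store clu
                    estimates table rob clu, se
                    * under -regress-, they are different estimators
                    regress ln_wage i.race, vce(robust)
                    regress ln_wage i.race, vce(cluster idcode)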
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)


                    • #11
                      Carlo has the right answer.

                      Ho-Chuan (River) Huang
                      Stata 17.0, MP(4)


                      • #12
                        Carlo and River, thank you very much for your replies. They are very helpful.

                        As this is for my thesis, the point is also to show that I've run the tests and show the results, not just for deciding whether I should use robust options. So how do I run the test after running the xtgls regression? It is unclear to me what's happening exactly with the xtgls command.

                        I'd like to run a Breusch-Pagan test for heteroskedasticity as I said, as my econometrics instructor told me I can use this for panel data as well. Is just using the -regress- and then -hettest- commands okay?

                        Also, for serial correlation (I know you said this is probably not a problem, but I'd like to include it in my thesis anyway) I've run the -xtserial- command which runs the Wooldridge (2002) test for serial correlation, is this ok? Also, I'd like to know if it is possible to use the White test, as a friend recommended it to me.

                        The aforementioned tests indicate that there is both serial correlation and heteroskedasticity in my data, thus this would lead me to use the robust option.
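                        For reference, a minimal sketch of what the -regress- route does when carried out by hand, using the N*R-squared (Koenker) form of the Breusch-Pagan statistic. The variable names are hypothetical placeholders, and the test is run on pooled OLS, so it ignores the panel structure:
                        Code:
                        * pooled OLS, ignoring the panel structure
                        regress default_risk board_size board_indep
                        predict double ehat, residuals
                        generate double ehat2 = ehat^2
                        * auxiliary regression of squared residuals on the regressors
                        regress ehat2 board_size board_indep
                        * LM statistic N*R-squared, chi-squared with k degrees of freedom
                        scalar lm = e(N)*e(r2)
                        display "LM = " lm "   p-value = " chi2tail(e(df_m), lm)
                        * built-in counterpart, rerun after the original regression
                        regress default_risk board_size board_indep
                        estat hettest, rhs iid
                        * Wooldridge test for first-order serial correlation (user-written)
                        xtserial default_risk board_size board_indep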
                        Last edited by Niels Meijer; 18 Sep 2017, 03:04.


                        • #13
                          I think White's is most commonly used for time series rather than panels. You could use the xtqptest command from SSC, which is a bit more flexible and powerful than xtserial. If you like testing, you can also run xtcdf to check for cross-sectional dependence.


                          • #14
                            Niels:
                            while the user-written programme -xtserial- is OK for testing serial correlation, the BP test that Stata offers for panel data (-xttest0-) tests the random effects specification, not heteroskedasticity (however, it's true that a BP test for heteroskedasticity is available in Stata as a -regress postestimation- command).
                            Again, you should consider -xttest3- for heteroskedasticity checking, keeping in mind that it works for -xtreg, fe- only (or -xtgls-).
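                            A minimal sketch of the distinction, with hypothetical variable names. -xttest0- ships with Stata, whereas -xttest3- (the modified Wald test for groupwise heteroskedasticity mentioned earlier in this thread) has to be installed from SSC:
                            Code:
                            xtset bank_id year
                            * Breusch-Pagan LM test for random effects (H0: Var(u_i) = 0)
                            xtreg default_risk board_size board_indep, re
                            xttest0
                            * modified Wald test for groupwise heteroskedasticity after fixed effects
                            ssc install xttest3
                            xtreg default_risk board_size board_indep, fe
                            xttest3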

                            PS: Crossed in cyberspace with Jesse's helpful reply, that, interestingly, shows different takes.
                            Kind regards,
                            Carlo
                            (Stata 18.0 SE)


                            • #15
                              Thank you once again for your responses. I ended up using xtserial (Wooldridge test) and the Breusch-Pagan test, which indicate the presence of both autocorrelation and heteroskedasticity. I'll run the test for cross-sectional dependence too, I think. Are robust/clustered errors also able to overcome cross-sectional interdependence?
