  • Help with Panel Data Model Selection with heteroskedasticity and autocorrelation

    Greetings to all Statalist members,

    My data are a balanced panel with N=12 and T=6, for a total of 72 observations. The purpose of the study is to estimate the effect of the independent variables on the dependent variables. There are 3 independent variables (STDTA, LTDTA, DE), 5 independent variables (ROE, ROA, NPM, PE, TQ) and 3 control variables (FS, AG, AT). Having 5 independent variables means there will be five equations for each of the models given below.

    To select the appropriate model among pooled OLS, fixed effects and random effects for linear panel regression, I have run the following tests for the ROA equation:

    1. For the choice between the RE and FE models, the hausman test result indicates the FE model:
    Code:
    Prob>chi2 =      0.0242
    2. The Wooldridge test for autocorrelation in panel data does not reject the null hypothesis:
    Code:
        F(  1,      11) =      6.905
               Prob > F =      0.0235
    3. The modified Wald test for groupwise heteroskedasticity in the fixed-effects regression model does not reject the null hypothesis either:
    Code:
    chi2 (12)  =   15089.82
    Prob>chi2 =      0.0000
    4. The above results imply that the FE model should be selected and that the data have autocorrelation and heteroskedasticity issues.

    Questions for the ROA equation:
    1. The robust/cluster option for xtreg, fe deals with heteroskedasticity, and the xtregar command deals with autocorrelation. How can I deal with both issues at the same time?
    2. Can I use .xtgls to deal with heteroskedasticity and autocorrelation in this scenario (using .xtgls produces significant results for this equation)?


    For all equations, according to .xtserial, .xttest3 and .hettest, autocorrelation and heteroskedasticity exist. I have some questions that need simple answers:
    1. Using .xtgls with corrections for heteroskedasticity and autocorrelation produces a significant model and parameters for all equations except one. So can I use xtgls for my study or not?
    2. I have read that .xtgls does not produce R-squared owing to GLS characteristics. Should I stop bothering with R-squared completely?


    Note: English is not my native language, so I am really sorry if I didn't present my questions and circumstances clearly.
    I have studied many econometrics books and searched Google and many forums over the last six months. Since I am not a math dude, all they have led me to is confusion.
    I am really pressed for time, so I am hoping for simple answers that can get through this thick head of mine.

    I will be grateful for your help in my predicament. Thanks

    Regards
    Abdul Qayyum
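
    For reference, the three tests quoted above can be reproduced with a sequence like the one below. This is only a sketch: the panel identifier firm and the time variable year are assumed names that do not appear in the post, and xtserial and xttest3 are user-written commands that must be installed before use. The null hypotheses are no first-order autocorrelation (xtserial) and equal error variance across panels (xttest3), so a small p-value signals a problem in both cases.
    Code:
    * firm and year are hypothetical names for the panel and time variables
    xtset firm year

    * Hausman test: FE versus RE
    xtreg ROA STDTA LTDTA DE FS AG AT, fe
    estimates store fe
    xtreg ROA STDTA LTDTA DE FS AG AT, re
    estimates store re
    hausman fe re

    * Wooldridge test for first-order autocorrelation (user-written)
    xtserial ROA STDTA LTDTA DE FS AG AT

    * Modified Wald test for groupwise heteroskedasticity (user-written),
    * run directly after the fixed-effects fit
    quietly xtreg ROA STDTA LTDTA DE FS AG AT, fe
    xttest3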

  • #2
    Abdul:
    welcome to the list.
    Two remarks about your query:
    _you seem to have far too many predictors (11) for such a limited sample size (72 obs);
    _the tests reported at 2. and 3. do reject the null.
    Kind regards,
    Carlo
    (StataNow 18.5)



    • #3
      Abdul:
      just to top off my previous reply, the robust/cluster option for -xtreg, fe- deals with both heteroskedasticity and autocorrelation (please see Example 3: Fixed-effects models with robust standard errors, -xtreg- entry, Stata .pdf manual).
      Kind regards,
      Carlo
      (StataNow 18.5)
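
      A minimal sketch of that suggestion for the ROA equation, assuming a panel identifier called firm and a time variable called year (both hypothetical, since the thread does not name them). With -xtreg, fe-, specifying -vce(robust)- is equivalent to clustering on the panel variable, so the two commands below should report the same standard errors.
      Code:
      * firm and year are hypothetical names for the panel and time variables
      xtset firm year

      * Cluster-robust standard errors, clustered on the panel variable
      xtreg ROA STDTA LTDTA DE FS AG AT, fe vce(cluster firm)

      * Same standard errors, written with vce(robust)
      xtreg ROA STDTA LTDTA DE FS AG AT, fe vce(robust)
      Note that with only 12 clusters, cluster-robust standard errors may be imprecise, which ties in with the sample-size remark in #2.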



      • #4
        Thanks Carlo Lazzaro for your quick reply.
        Sorry for mixing up the alternative with the null hypothesis. (Is there any option to edit my first post? I am unable to find one.)

        Correct me if I am getting it wrong: the study contains only 3 predictor (independent) variables, which are STDTA, LTDTA and DE, plus 3 control variables (FS, AG and AT). There are 5 dependent variables (ROE, ROA, NPM, PE and TQ), which the study is trying to predict.
        For instance, the regression commands for all equations (dependent variables) will look like this in Stata:
        Code:
        xtreg ROE STDTA LTDTA DE FS AG AT, fe
        xtreg ROA STDTA LTDTA DE FS AG AT, fe
        xtreg NPM STDTA LTDTA DE FS AG AT, fe
        xtreg PE STDTA LTDTA DE FS AG AT, fe
        xtreg TQ STDTA LTDTA DE FS AG AT, fe
        along with the robust/cluster option

        Can you give your advice/insights on the use of .xtgls in this scenario? Or on when xtgls should be used?


        The main reason for repeatedly asking about the .xtgls estimates is that it is the only estimator that provides significant parameters for all 5 dependent variables.

        Regards,
        Abdul Qayum



        • #5
          Abdul:
          - I would go for a more parsimonious model (5 different regression equations seem like overkill to me); hence, I would skim the literature to see what others did in the past when presented with the same research topic;
          - -xtgls- is not the best choice in your case, since you have a "large N, small T" panel dataset; relying upon the -hausman- verdict, you should go -fe-, as you reported (however, please consider that the -cluster()- option is not allowed with -hausman-; see the unofficial Stata command -xtoverid- instead, via -help xtoverid-);
          - as a general remark, preferring one method over another only in the light of statistically significant results sounds like a weak approach, especially when you deal with such a small sample as yours.
          Kind regards,
          Carlo
          (StataNow 18.5)
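
          A sketch of the -xtoverid- route mentioned in this post, for the case where clustered standard errors rule out -hausman-. It assumes -xtoverid- has been installed from SSC and that the panel identifier is called firm (a hypothetical name). The command is run after a random-effects fit; a rejection is usually read as evidence against -re- and in favour of -fe-.
          Code:
          * xtoverid is user-written: ssc install xtoverid
          * firm is a hypothetical name for the panel identifier
          xtreg ROA STDTA LTDTA DE FS AG AT, re vce(cluster firm)
          xtoverid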



          • #6
            "Carlo"
            Really grateful for your valuable time and inputs.

            Hopefully this will be my last question;
            For NPM equation
            Code:
            xtreg NPM STDTA LTDTA DE FS AG AT, fe
            1. The "hausman" test supports the fixed-effects model.
            2. On the other hand, the "Breusch and Pagan Lagrangian multiplier" test (command .xttest0), used for deciding between random effects and pooled OLS, favors random effects.
            What should I do in this situation? Should I choose fixed effects or random effects?

            I hope I will not face any further issues and bother you again.

            Best Regards,
            Abdul Qayum



            • #7
              Abdul:
              I would go -fe-.
              Kind regards,
              Carlo
              (StataNow 18.5)
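
              The two results in #6 need not conflict, because the tests compare different pairs of models: -xttest0- asks whether -re- is preferable to pooled OLS, while -hausman- asks whether -re- is consistent relative to -fe-. If -xttest0- rejects pooled OLS and -hausman- rejects -re-, then -fe- is what remains. A sketch of the -xttest0- step for the NPM equation, assuming the data have already been -xtset-:
              Code:
              * Breusch-Pagan LM test: RE versus pooled OLS (run after an RE fit)
              quietly xtreg NPM STDTA LTDTA DE FS AG AT, re
              xttest0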



              • #8
                Hi, I have a similar problem. I ran my model and it showed a heteroscedasticity problem. I am using FEM, and when I use robust standard errors, all coefficients become non-significant. I don't know what's wrong. Can you help me?



                • #9
                  Kenny:
                  please post what you typed and what Stata gave you back (as per FAQ). Thanks.
                  Kind regards,
                  Carlo
                  (StataNow 18.5)



                  • #10
                    Dear all,

                    I have more or less the same question and am a bit confused about which analysis to use in STATA to generate the right results. The analysis of my unbalanced panel dataset implies that the FE model has to be used; in addition, both heteroskedasticity and autocorrelation are present. I did a lot of research on the internet and in articles, and different options show up on how to deal with this, but I'm not sure which model is the most valid for this particular case. The options that I found are:

                    xtreg, fe robust - however, my results turn out to be non-significant when using this analysis
                    xtreg, fe vce(robust) - however, this option does not control for autocorrelation according to the article of Hoechle.
                    xtscc, fe

                    As far as I understand the xtscc, fe option turns out to be the best option. However, I'm not sure if the sample is cross-sectionally dependent.

                    Can someone please tell me which option is the best option to use in this case?

                    Kind regards,

                    Jeroen




                    • #11
                      Jeroen:
                      welcome to the list.
                      As reported in the helpfile for the user-written command -xtscc- (as per FAQ and the reasons explained therein, please report where you got any user-written command from. Thanks), this one seems to be a wise choice to fix your problem when you have small N, large T panel data.
                      Conversely, if, as often happens, your dataset is a large N, small T one, you would probably be better off with:
                      Code:
                      xtreg, fe vce(robust)
                      Kind regards,
                      Carlo
                      (StataNow 18.5)
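
                      For the question in #10 about cross-sectional dependence, a sketch of how one might check for it and then fit -xtscc-. Everything named here is an assumption: y, x1 and x2 stand in for the unnamed variables, id and year for the panel and time variables, and both -xtscc- and -xtcsd- are user-written commands that need to be installed first (for example, located with -search-).
                      Code:
                      * y, x1, x2, id and year are hypothetical variable names
                      xtset id year

                      * Pesaran CD test for cross-sectional dependence (user-written),
                      * run after the fixed-effects fit
                      quietly xtreg y x1 x2, fe
                      xtcsd, pesaran abs

                      * Fixed effects with Driscoll-Kraay standard errors (user-written)
                      xtscc y x1 x2, fe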



                      • #12
                        Dear Carlo,

                        Thank you for your quick reply. So if I'm right in my particular case with N=39 and T=9 the code you suggest: xtreg, fe vce(robust) would be the right code to apply to run analysis with my data sample? What about the autocorrelation problem that still is present in this case? Thank you in advance again!

                        Kind regards,

                        Jeroen
                        Last edited by Jeroen vanDam; 18 Jun 2016, 07:39.



                        • #13
                          Jeroen:
                          yes, I would go:
                          Code:
                          xtreg, fe vce(robust)

                          As far as autocorrelation is concerned, -vce(robust)- accommodates heteroskedasticity and/or autocorrelation, as reported in Example 3: Fixed-effects models with robust standard errors, -xtreg- entry, Stata (not STATA, please) .pdf manual.

                          Kind regards,
                          Carlo
                          (StataNow 18.5)



                          • #14
                            Dear Carlo,

                            I did indeed find the same in the example you mentioned. Is it right that this code omits control variables that remain the same during the whole time period, since the fixed effects model controls for each entity within the sample? (I'm not that experienced with STATA/statistics, as you might have noticed.) Thank you again!

                            Kind regards,

                            Jeroen



                            • #15
                              Jeroen:
                              -fe- specification cancels out time-invariant predictors (i.e., those that do not change across years, such as race, for instance).
                              Hence, you will not get any estimation for them in your regression table, as you can see from the following example:
                              Code:
                              . use "http://www.stata-press.com/data/r14/nlswork.dta", clear
                              (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
                              
                              
                              . xtreg ln_wage race hours, fe
                              note: race omitted because of collinearity
                              
                              Fixed-effects (within) regression               Number of obs     =     28,467
                              Group variable: idcode                          Number of groups  =      4,710
                              
                              R-sq:                                           Obs per group:
                                   within  = 0.0001                                         min =          1
                                   between = 0.0314                                         avg =        6.0
                                   overall = 0.0074                                         max =         15
                              
                                                                              F(1,23756)        =       3.14
                              corr(u_i, Xb)  = 0.0976                         Prob > F          =     0.0764
                              
                              ------------------------------------------------------------------------------
                                   ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                      race |          0  (omitted)
                                     hours |   .0004474   .0002525     1.77   0.076    -.0000475    .0009423
                                     _cons |   1.658941   .0094249   176.02   0.000     1.640468    1.677415
                              -------------+----------------------------------------------------------------
                                   sigma_u |   .4229084
                                   sigma_e |  .32040339
                                       rho |  .63532952   (fraction of variance due to u_i)
                              ------------------------------------------------------------------------------
                              F test that all u_i=0: F(4709, 23756) = 8.30                 Prob > F = 0.0000
                              
                              . xtreg ln_wage race hours, re
                              
                              Random-effects GLS regression                   Number of obs     =     28,467
                              Group variable: idcode                          Number of groups  =      4,710
                              
                              R-sq:                                           Obs per group:
                                   within  = 0.0001                                         min =          1
                                   between = 0.0224                                         avg =        6.0
                                   overall = 0.0186                                         max =         15
                              
                                                                              Wald chi2(2)      =      99.76
                              corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
                              
                              ------------------------------------------------------------------------------
                                   ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                              -------------+----------------------------------------------------------------
                                      race |  -.1032299   .0123018    -8.39   0.000    -.1273408   -.0791189
                                     hours |   .0013406   .0002407     5.57   0.000     .0008688    .0018124
                                     _cons |   1.742316   .0190717    91.36   0.000     1.704936    1.779696
                              -------------+----------------------------------------------------------------
                                   sigma_u |  .37431348
                                   sigma_e |  .32040339
                                       rho |  .57713559   (fraction of variance due to u_i)
                              ------------------------------------------------------------------------------
                              Kind regards,
                              Carlo
                              (StataNow 18.5)
