  • Panel Regression

    Hello, I am using Stata/SE 14 and am currently working on my final thesis. My advisor told me to run a robustness check on the overall sample, and I am wondering what kind of tests I should run on my panel data to ensure the validity of my results. N=12, T=10, so I have 120 observations in total. My advisor also advised me to run a sub-sample regression as a comparison.

    I'm confused as to whether I can use OLS with -reg- or should use fixed or random effects with -xtreg-. And if I use OLS, would -rreg- be enough for the robustness check, or should I run a sensitivity analysis with lag(1)?

  • #2
    Steven:
    welcome to this forum.
    First of all, I'm not clear whether you used -regress- or -xtreg- (as you should have, as a first choice, at least) for your panel data regression.
    That said, whenever it comes to robustness issues, the first question to address is: robustness with respect to what?
    Usually, regression models should be checked for model misspecification (that can hide endogeneity) and heteroskedasticity in the idiosyncratic residual distribution.
    Finally, as you do not seem to have a large sample, it would be wise to limit the number of your predictors.
    As the FAQ reminds us, more helpful replies are conditional on posting what you typed and what Stata gave you back.
    Kind regards,
    Carlo
    (Stata 19.0)
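
    A minimal sketch of the two checks mentioned above (misspecification and heteroskedasticity), assuming the -nlswork- toy dataset and two illustrative predictors (-age- and -tenure-) rather than the poster's own variables. The squared-fitted-values step is a hand-made, RESET-style check, not a built-in post-estimation command for -xtreg-:
    Code:
    . webuse nlswork, clear
    . xtset idcode year

    * fixed effects with cluster-robust standard errors (guards against heteroskedasticity)
    . xtreg ln_wage age tenure, fe vce(cluster idcode)

    * RESET-style check: a significant squared fitted value hints at misspecification
    . predict double fitted, xb
    . generate double sq_fitted = fitted^2
    . xtreg ln_wage fitted sq_fitted, fe vce(cluster idcode)
    . test sq_fitted
    Note that with as few as 12 panels, cluster-robust standard errors can be unreliable, so this is a starting point rather than a recipe.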



    • #3
      Thank you for your response. I tried using xtreg in the beginning, but I encountered difficulty in analyzing the sub-sample (sub1=40obs, sub2=80obs).

      But when I ran the -xtreg, fe- and -xtreg, re- regressions on the sub-samples, the -hausman- test returned "(V_b-V_B is not positive definite)", and "Note: the rank of the differenced variance matrix (5) does not equal the number of coefficients being tested (7)" when I added sigmamore. Can I still use the -hausman- result after those messages, or would it be better to use OLS regression?
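
      For reference, the sequence that leads to this test usually looks something like the following. It is only a sketch on the -nlswork- toy dataset; the predictors (-age-, -tenure-) and the stored-estimate names (-fe-, -re-) are illustrative, not the ones used in the thesis:
      Code:
      . webuse nlswork, clear
      . xtset idcode year

      * fit and store both models, consistent estimator (fe) first
      . xtreg ln_wage age tenure, fe
      . estimates store fe
      . xtreg ln_wage age tenure, re
      . estimates store re

      * sigmamore bases both covariance matrices on the efficient estimator's disturbance variance
      . hausman fe re, sigmamore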



      • #4
        Steven:
        I fail to get what you mean by "difficulty in analyzing the subsample": usually, that can easily be done by adding a two-level categorical variable (say, 0=sub1; 1=sub2) to the set of predictors.
        That said, the nuisance you report with -hausman- is pretty frequent.
        Perhaps you can give it a try with the community-contributed command -xtoverid- (which needs only the -re- specification to work, as the null is that -re- is the way to go), as you can see in the following toy example:
        Code:
        . use "https://www.stata-press.com/data/r16/nlswork.dta"
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        
        . xtreg ln_wage age, re
        
        Random-effects GLS regression                   Number of obs     =     28,510
        Group variable: idcode                          Number of groups  =      4,710
        
        R-sq:                                           Obs per group:
             within  = 0.1026                                         min =          1
             between = 0.0877                                         avg =        6.1
             overall = 0.0774                                         max =         15
        
                                                        Wald chi2(1)      =    3140.35
        corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   .0185667   .0003313    56.04   0.000     .0179174    .0192161
               _cons |   1.120439   .0112038   100.01   0.000      1.09848    1.142398
        -------------+----------------------------------------------------------------
             sigma_u |  .36972456
             sigma_e |  .30349389
                 rho |  .59743613   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . xtoverid
        
        Test of overidentifying restrictions: fixed vs random effects
        Cross-section time-series model: xtreg re  
        Sargan-Hansen statistic  17.401  Chi-sq(1)    P-value = 0.0000
        
        .
        Kind regards,
        Carlo
        (Stata 19.0)
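
        The two-level categorical variable suggested at the top of this post might be sketched as follows. This assumes the same -nlswork- toy dataset; the indicator -sub- is a hypothetical name, built here from -south- purely for illustration:
        Code:
        . webuse nlswork, clear
        . xtset idcode year

        * hypothetical subsample indicator: 0 = sub1, 1 = sub2
        . generate byte sub = (south == 1) if !missing(south)

        * one model for the whole sample; the interaction lets the slope differ by subsample
        . xtreg ln_wage c.age##i.sub, fe

        * alternatively, restrict the estimation sample with -if-
        . xtreg ln_wage age if sub == 0, fe
        . xtreg ln_wage age if sub == 1, fe
        The interacted model keeps all observations in a single regression, which matters when each subsample on its own is small.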



        • #5
          What I meant regarding the sub-sample is that aside from analyzing the overall 120 observations, I was told to analyze the data in two smaller groups consisting of 40 observations and 80 observations.

          I tried your suggestion of using -xtoverid-, but Stata responded with "Error - saved RE estimates are degenerate (sigma_u=0) and equivalent to pooled OLS r(198);". Both -xtreg, re- regressions on the smaller groups report sigma_u and rho equal to 0, and sigma_e is >1.



          • #6
            Steven:
            the results you got from Stata point you toward pooled OLS, due to the lack of evidence of a panel-wise effect.
            Actually, your subsamples are really small and you will probably have to further reduce the number of your predictors if you want to analyze them separately.
            Kind regards,
            Carlo
            (Stata 19.0)
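
            A sketch of how one might check for a panel-wise effect and then fall back on pooled OLS, again on the -nlswork- toy dataset with illustrative predictors (-age-, -tenure-); none of these variable names come from the thesis data:
            Code:
            . webuse nlswork, clear
            . xtset idcode year

            * Breusch-Pagan LM test after -xtreg, re-: the null is Var(u) = 0, i.e. no panel-wise effect
            . xtreg ln_wage age tenure, re
            . xttest0

            * pooled OLS, with standard errors clustered on the panel identifier
            . regress ln_wage age tenure, vce(cluster idcode)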



            • #7
              Hello, Carlo. Thank you so much for your input. I recently presented my OLS regression results to my advisor, and he told me to use panel regression through -xtreg, fe-, since I'm analyzing countries and the fixed-effects model is supposedly the usual choice for that. However, since I'm still an undergrad, I need to perform the hausman test to confirm the use of fixed effects.

              The robustness check that my advisor told me to perform consists of using different combinations of control variables in the regression to ensure the result stays the same. However, I find that the independent variable is significant at the 5% level in some regressions and at the 1% level in the rest, while _constant is not significant in some combinations and significant in the rest.

              In order to say my model is robust, do the independent variable and _constant have to be significant at the same level in every combination, or is it okay as long as the independent variable is consistently found to be significant?
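
              For context, the kind of check described here, re-estimating the model with different sets of controls and comparing the coefficient of interest, might be sketched like this. It uses the -nlswork- toy dataset; -age- stands in for the independent variable, and the control sets and the names -spec1- to -spec3- are purely illustrative:
              Code:
              . webuse nlswork, clear
              . xtset idcode year

              * three illustrative sets of control variables
              . local controls1 "tenure"
              . local controls2 "tenure hours"
              . local controls3 "tenure hours ttl_exp"

              * fit each specification quietly and store it
              . forvalues i = 1/3 {
                    quietly xtreg ln_wage age `controls`i'', fe
                    estimates store spec`i'
                }

              * compare the coefficient and standard error of the variable of interest
              . estimates table spec1 spec2 spec3, keep(age) b(%9.4f) se(%9.4f) stats(N)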



              • #8
                Steven:
                do not bother yourself with _constant being above/below the arbitrary (and oversold) 5% threshold.
                This especially holds for -xtreg,fe- where the _constant relevance is really negligible (see: https://www.stata.com/support/faqs/s...effects-model/).
                As an aside, performing -hausman- test has nothing to do with your education level!
                Kind regards,
                Carlo
                (Stata 19.0)
