  • Panel data with Pooled OLS regression and testing

    Dear All,

    I am working with a panel dataset with N = 26 and T = 13 (balanced, no missing data). I checked whether I should estimate the model with random effects (REM) or fixed effects (FEM), but the p-value from -xttest0- is not significant, so I believe pooled OLS is the better choice?

    Code:
    xtreg roa fo size lage leverage div_payout asset_turnover
    
    Random-effects GLS regression                   Number of obs     =        338
    Group variable: code                            Number of groups  =         26
    
    R-squared:                                      Obs per group:
         Within  = 0.9786                                         min =         13
         Between = 0.9959                                         avg =       13.0
         Overall = 0.9828                                         max =         13
    
                                                    Wald chi2(6)      =   18927.20
    corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000
    
    --------------------------------------------------------------------------------
               roa | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    ---------------+----------------------------------------------------------------
                fo |   .0002094   .0006397     0.33   0.743    -.0010444    .0014633
              size |  -.0002191   .0000968    -2.26   0.024    -.0004088   -.0000295
              lage |  -.0004467   .0001438    -3.11   0.002    -.0007286   -.0001648
          leverage |   .0016867   .0018883     0.89   0.372    -.0020144    .0053878
        div_payout |   .0004386   .0000608     7.22   0.000     .0003195    .0005578
    asset_turnover |   1.108709   .0102544   108.12   0.000      1.08861    1.128807
             _cons |   .0063234   .0024668     2.56   0.010     .0014884    .0111583
    ---------------+----------------------------------------------------------------
           sigma_u |          0
           sigma_e |  .00108588
               rho |          0   (fraction of variance due to u_i)
    --------------------------------------------------------------------------------

    Code:
    xttest0
    
            Estimated results:
                             |       Var     SD = sqrt(Var)
                    ---------+-----------------------------
                         roa |   .0000706       .0084049
                           e |   1.18e-06       .0010859
                           u |          0              0
    
            Test: Var(u) = 0
                                 chibar2(01) =     0.00
                              Prob > chibar2 =   1.0000
    Since I do not have much experience with pooled OLS (I have worked more often with xtreg, re/fe), I would like to ask:
    1. Is this the correct command for pooled OLS?
      Code:
      reg roa fo size lage leverage div_payout asset_turnover, vce(cluster code)
    2. If I want to test for heteroskedasticity, would imtest, white work?
    3. Does xtserial work for checking serial correlation with pooled OLS?
    4. What command can I use to check for cross-sectional correlation?
    5. Should any of the above violations exist, would xtgls be appropriate to address them? I understand that xtgls works better for T>N datasets, so I'm unsure what to do.
    Thank you!

  • #2
    Trung:
    1) as per -xtreg,re- results, you do not have a panel-wise effect;
    2) your code for pooled OLS is correct;
    3) -hettest- would work too;
    4) you need to -tsset- your data before running -xtserial- after -regress-;
    5) no, as -xtgls-, as you correctly specified, is for T>N panel datasets.
    Kind regards,
    Carlo
    (Stata 19.0)
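
    A minimal sketch of the checks discussed in this reply, assuming the time variable is called year (the name used later in this thread) and that the community-contributed -xtserial- is installed. As far as I know, -estat hettest- and -estat imtest- are refused after vce(robust) or vce(cluster), so the heteroskedasticity tests are run after -regress- with default standard errors and the clustered model is fitted last.

    Code:
    * Sketch only; "year" is an assumed variable name
    * -xtserial- is community-contributed; -search xtserial- locates it
    xtset code year

    * Heteroskedasticity tests require conventional (non-robust) standard errors
    regress roa fo size lage leverage div_payout asset_turnover
    estat hettest
    estat imtest, white

    * Wooldridge test for serial correlation; it fits its own first-differenced regression
    xtserial roa fo size lage leverage div_payout asset_turnover

    * Pooled OLS with standard errors clustered by firm
    regress roa fo size lage leverage div_payout asset_turnover, vce(cluster code)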



    • #3
      A few things. First, xttest0 is almost useless in modern panel data analysis. It rejects when there is positive serial correlation, whether or not the source of it is a "random effect." It will reject if you have AR(1) errors. Second, if you have negative serial correlation in the e(i,t) (on average), you can easily estimate zero for the variance of the unit effect (sigma_u in the output). That's what's happening in Trung's application. This does not mean there is no unobserved firm heterogeneity. It could just as well be that there's negative serial correlation. The default should be that you try fixed effects.
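
      One way to see the point about the zero estimate, as a sketch rather than a full derivation: write the composite error as v(i,t) = u(i) + e(i,t) and assume u(i) is uncorrelated with e(i,t). Then, for t ≠ s,

      \operatorname{Cov}(v_{it}, v_{is}) = \sigma_u^2 + \operatorname{Cov}(e_{it}, e_{is})

      The random-effects variance estimate is, in effect, backed out of this within-firm covariance. If Cov(e_it, e_is) is negative enough, the implied estimate of \sigma_u^2 is zero or negative and is typically reported as sigma_u = 0, even when the true \sigma_u^2 is positive. The same covariance is what xttest0 responds to, which is why positive serial correlation in e(i,t) can make it reject without any random effect.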

      xtserial uses the first differenced equation to test for serial correlation, not the levels. But there's no need to test for serial correlation. All such tests rely on asymptotic analysis. If you're going to rely on that, you might as well cluster your standard errors at the firm level -- even though N = 26 is pretty small for clustering.

      My suggestion is to use fixed effects. One reason you might be finding negative serial correlation is because you don't have year dummies in your model. In economics, that's practically a requirement because you need to control for secular changes. So, I recommend so-called two-way fixed effects with clustered standard errors. Don't test for heteroskedasticity or serial correlation or cross-sectional correlation.
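
      The two-way fixed effects suggestion could be written roughly as follows. This is a sketch that assumes the time variable is named year; -testparm- is included only to show how the year dummies can be tested jointly.

      Code:
      * Sketch only; "year" is an assumed variable name
      xtset code year

      * Firm fixed effects plus year dummies, standard errors clustered by firm
      xtreg roa fo size lage leverage div_payout asset_turnover i.year, fe vce(cluster code)

      * Joint test that all year effects are zero
      testparm i.year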



      • #4
        Thank you Mr. Lazzaro and Mr. Wooldridge for your reply! I will take your feedback into consideration to improve my model.

        I would like to ask Mr. Wooldridge one more question for confirmation. If I understand your suggestion correctly, my model would now be:
        Code:
        xtreg roa fo size lage leverage div_payout asset_turnover i.year, fe cluster(code)
        Would this command address the potential problems of heteroskedasticity, serial correlation, and cross-sectional correlation? My reasoning is that clustering would address the heteroskedasticity, the -fe- option would address the cross-sectional correlation, and i.year would address the serial correlation.

        Again, thank you so much for your help!



        • #5
          Hi everyone,

          I'm trying to reproduce a pooled regression with standard errors clustered at the firm and year level that includes year-month fixed effects and industry fixed effects.

          The dependent variable is the yearly return of 350 firms, each assigned to one of 9 industry categories. Each firm has 6 observations (one per year).

          So within each industry I have several units (firms), which means multiple observations per industry per year. As a result, declaring the data as a panel doesn't work...

          "repeated time values in sample" which makes sense...

          Given that, instead of declaring a time-series panel, I grouped two variables (industry * year) and controlled for that effect.

          Even though this runs, it raises two problems:

          1. I don't know how to check for autocorrelation, because (again) I cannot define the time variable (*tsset*).
          2. My model is different from the paper's model...

          I'm sorry if my questions appear to be very simple...

          Thank you for your help



          • #6
            Melanie:
            welcome to this forum.
            If, for any reason, you want to go pooled OLS, there's no need to -xtset- your dataset.
            That said, if you really have panel data, it would be much more helpful to compare -xtreg,fe- vs -xtreg,re-.
            It would also be more informative to follow the FAQ and share what you typed and what Stata gave you back. Thanks.
            Kind regards,
            Carlo
            (Stata 19.0)
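
            For the setup described in #5, a minimal sketch of pooled OLS with industry and year fixed effects might look like the following. The variable names (ret, x1, x2, firm_id, industry, year) are hypothetical placeholders, firm_id is assumed to be numeric, and -reghdfe- (with its dependency -ftools-) is a community-contributed command. The panel unit is the firm rather than the industry, so -xtset firm_id year- should not raise "repeated time values in sample", provided each firm appears once per year.

            Code:
            * Hypothetical variable names; adjust to your dataset
            ssc install ftools         // community-contributed; run once
            ssc install reghdfe        // community-contributed; run once

            * Only needed for time-series operators or -xtserial-, not for pooled OLS itself
            xtset firm_id year

            * Pooled OLS with industry and year dummies, standard errors clustered by firm
            regress ret x1 x2 i.industry i.year, vce(cluster firm_id)

            * Same fixed effects, standard errors clustered by firm and by year
            reghdfe ret x1 x2, absorb(industry year) vce(cluster firm_id year)

            With only 6 years, the year dimension supplies very few clusters, so the two-way clustered standard errors are probably best read with caution.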

