Panel data with Pooled OLS regression and testing

Trung Phan

Join Date: Nov 2023
Posts: 2

18 Nov 2023, 08:31

Dear All,

I am working with a panel dataset, with N = 26 and T = 13 (balanced, no missing data). I checked whether I should estimate the model with random or fixed effects (REM/FEM), but the p-value is insignificant, so I believe pooled OLS is the better choice?

Code:

xtreg roa fo size lage leverage div_payout asset_turnover

Random-effects GLS regression                   Number of obs     =        338
Group variable: code                            Number of groups  =         26

R-squared:                                      Obs per group:
     Within  = 0.9786                                         min =         13
     Between = 0.9959                                         avg =       13.0
     Overall = 0.9828                                         max =         13

                                                Wald chi2(6)      =   18927.20
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

--------------------------------------------------------------------------------
           roa | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
---------------+----------------------------------------------------------------
            fo |   .0002094   .0006397     0.33   0.743    -.0010444    .0014633
          size |  -.0002191   .0000968    -2.26   0.024    -.0004088   -.0000295
          lage |  -.0004467   .0001438    -3.11   0.002    -.0007286   -.0001648
      leverage |   .0016867   .0018883     0.89   0.372    -.0020144    .0053878
    div_payout |   .0004386   .0000608     7.22   0.000     .0003195    .0005578
asset_turnover |   1.108709   .0102544   108.12   0.000      1.08861    1.128807
         _cons |   .0063234   .0024668     2.56   0.010     .0014884    .0111583
---------------+----------------------------------------------------------------
       sigma_u |          0
       sigma_e |  .00108588
           rho |          0   (fraction of variance due to u_i)
--------------------------------------------------------------------------------

Code:

xttest0

        Estimated results:
                         |       Var     SD = sqrt(Var)
                ---------+-----------------------------
                     roa |   .0000706       .0084049
                       e |   1.18e-06       .0010859
                       u |          0              0

        Test: Var(u) = 0
                             chibar2(01) =     0.00
                          Prob > chibar2 =   1.0000

Since I do not have much experience with pooled OLS (I have worked with xtreg, re/fe more frequently), I would like to ask:

Is this the correct command for pooled OLS?

Code:

reg roa fo size lage leverage div_payout asset_turnover, vce(cluster code)

If I want to test for heteroskedasticity, would imtest, white work?
Does xtserial work in this case of checking for serial correlation with Pooled OLS?
What command can I use to check for cross-sectional correlation?
Should any of the above violations exist, would xtgls be appropriate to address them? I understand that xtgls works better for T > N datasets, so I'm unsure what to do.

Thank you!

Tags: panel, panel data, Pooled OLS

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17719
#2

18 Nov 2023, 09:26

Trung:
1) as per -xtreg,re- results, you do not have a panel-wise effect;
2) your code for pooled OLS is correct;
3) -hettest- would work too;
4) you need to -tsset- your data before running -xtserial- after -regress-;
5) no, as -xtgls-, as you correctly specified, is for T>N panel datasets.

Kind regards,
Carlo
(Stata 19.0)
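For reference, a minimal sketch of the heteroskedasticity checks mentioned in this thread, using the variable names from post #1. To the best of my knowledge, -estat imtest- and -estat hettest- are refused after -regress- with a robust or clustered VCE, so the sketch first fits the model with the default VCE; treat that ordering as an assumption to verify on your own installation.

```stata
* Pooled OLS with the default (non-robust) VCE, used only for the tests below
regress roa fo size lage leverage div_payout asset_turnover

* White's general test for heteroskedasticity
estat imtest, white

* Breusch-Pagan / Cook-Weisberg test
estat hettest

* Refit with firm-clustered standard errors for inference
regress roa fo size lage leverage div_payout asset_turnover, vce(cluster code)
```

The coefficients are identical in both fits; only the standard errors change.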
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2183
#3

18 Nov 2023, 09:54

A few things. First, the xttest0 is almost useless in modern panel data analysis. It rejects when there is positive serial correlation -- whether or not the source of it is a "random effect." It will reject if you have AR(1) errors. Second, if you have negative serial correlation in the e(i,t) (on average), you can easily estimate zero for the variance of the unobserved effect (sigma_u). That's what's happening in Trung's application. This does not mean there is no unobserved firm heterogeneity. It could just as well be that there's negative serial correlation. The default should be that you try fixed effects.

xtserial uses the first differenced equation to test for serial correlation, not the levels. But there's no need to test for serial correlation. All such tests rely on asymptotic analysis. If you're going to rely on that, you might as well cluster your standard errors at the firm level -- even though N = 26 is pretty small for clustering.
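To make the first-difference point concrete, here is a rough sketch of the kind of calculation -xtserial- is built on, not its actual source code. It assumes the data are -xtset- by `code` with a yearly variable named `year`; the residual name `e_fd` is made up for illustration. If the errors in levels are serially uncorrelated, the first-differenced errors have a first-order autocorrelation of -0.5, which is the value being tested.

```stata
* Assumption: panel id is code, time variable is year
xtset code year

* Regress the first-differenced outcome on the first-differenced regressors
regress D.roa D.(fo size lage leverage div_payout asset_turnover), noconstant

* Residuals from the first-differenced equation (e_fd is a hypothetical name)
predict double e_fd, residuals

* Regress the residuals on their own lag, clustering by firm
regress e_fd L.e_fd, noconstant vce(cluster code)

* H0: no serial correlation in levels implies a coefficient of -0.5
test L.e_fd = -0.5
```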

My suggestion is to use fixed effects. One reason you might be finding negative serial correlation is because you don't have year dummies in your model. In economics, that's practically a requirement because you need to control for secular changes. So, I recommend so-called two-way fixed effects with clustered standard errors. Don't test for heteroskedasticity or serial correlation or cross-sectional correlation.
Trung Phan

Join Date: Nov 2023

Posts: 2
#4

18 Nov 2023, 19:49

Thank you Mr. Lazzaro and Mr. Wooldridge for your reply! I will take your feedback into consideration to improve my model.

I would like to ask Mr. Wooldridge one more question for confirmation, though. If I understand your suggestion correctly, my model would now be modified to:

Code:

xtreg roa fo size lage leverage div_payout asset_turnover i.year, fe cluster(code)

and that the above command would address the potential problems of heteroskedasticity/serial correlation/cross-sectional correlation (because clustering would address the heteroskedasticity, -fe- option would address the cross-sectional correlation, and i.year would address the serial correlation)?

Again, thank you so much for your help!
Melanie Dupont

Join Date: Dec 2023

Posts: 1
#5

21 Dec 2023, 12:51

Hi everyone,

I'm trying to reproduce a pooled regression model with standard errors clustered at the firm and year level that includes year-month fixed effects and industry fixed effects.

The dependent variable is the yearly return of different firms (350); each firm belongs to one of 9 industry categories. Each firm has 6 observations (1 per year).

So within each industry I have several units (firms), which means multiple observations per industry per year. So my panel setup doesn't work...

I get "repeated time values in sample", which makes sense...

Considering that, instead of setting up a time-series panel, I grouped two variables (industry * year) and controlled for that effect.

Even though it works, it raises two problems:

1. I don't know how to check for autocorrelation, since I am (again) unable to define the time variable (*tsset*)
2. my model is different from the paper's model...

I'm sorry if my questions appear to be very simple...

Thank you for your help
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17719
#6

22 Dec 2023, 10:41

Melanie:
welcome to this forum.
If, for any reason, you want to go pooled OLS, there's no need to -xtset- your dataset.
That said, if you really have panel data, it would be much more helpful to compare -xtreg,fe- vs -xtreg,re-.
It would also be more informative to share what you typed and what Stata gave you back (as per the FAQ). Thanks.

Kind regards,
Carlo
(Stata 19.0)
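A hedged sketch of one way to set up the model described in #5, assuming variables named `ret` (yearly return), `x1` and `x2` (regressors), `firm_id`, `industry`, and `year`; all of these names are placeholders, not taken from the posts above. The "repeated time values in sample" error usually means the panel variable passed to -xtset- or -tsset- does not uniquely identify a unit within each period, which is what happens if industry rather than firm is declared as the panel. For two-way clustering the sketch uses the community-contributed -reghdfe- command, which is not part of official Stata.

```stata
* Declare the firm, not the industry, as the panel unit
* (only needed for time-series operators or xt commands)
xtset firm_id year

* Run once to install the community-contributed command and its dependency:
* ssc install ftools
* ssc install reghdfe

* Pooled regression with industry and year fixed effects,
* standard errors clustered by firm and by year
reghdfe ret x1 x2, absorb(industry year) vce(cluster firm_id year)
```

With only 6 years of data, the year dimension supplies very few clusters, so the two-way clustered standard errors should be read with caution.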