Panel Data Tests and Regression Steps

Nicco Casa

Join Date: Apr 2023
Posts: 2

11 Apr 2023, 12:07

Dear Statalist,

I am new to panel data regression analysis and Stata, so please forgive my perhaps basic questions.
Context:
I am testing an asset pricing model with portfolio excess returns as the dependent variable and the market excess return and an ESG factor as independent variables. Simplified, let's call: DV = portfolio return (Ri); IV1 = market factor (RmRf); IV2 = ESG factor (ESG).
The portfolios have already been formed, so the data comprise 5 portfolios observed over 13 years, 65 observations in total.

The first steps I took were to test the model assumptions (principally heteroskedasticity, multicollinearity, and autocorrelation).

Heteroskedasticity:

Code:

xtgls Ri RmRf ESG, igls panels(heteroskedastic)
estimates store hetero
xtgls Ri RmRf ESG, igls 
local df = e(N_g) - 1
lrtest hetero . , df(`df')

The test showed Prob > chi2 = 0.000, so the data are heteroskedastic.
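
For reference, the comparison behind this test can be written out as below. This is a sketch that assumes both -xtgls- fits with igls are maximum likelihood estimates; the symbols $\ell_{\text{het}}$, $\ell_{\text{hom}}$, and $N_g$ are labels chosen here, not names from the Stata output.

```latex
LR = 2\left(\ell_{\text{het}} - \ell_{\text{hom}}\right) \sim \chi^2(N_g - 1),
\qquad N_g = 5 \;\Rightarrow\; \text{df} = 4
```

The heteroskedastic model estimates one error variance per panel ($N_g$ in total) instead of a single common variance, which is where the $N_g - 1$ degrees of freedom come from.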

FE vs RE:

Code:

xtreg Ri RmRf ESG, fe
estimates store fe
xtreg Ri RmRf ESG, re
estimates store re
hausman fe re

The test showed Prob > chi2 = 1.000, so RE will be used.
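
As a reminder of what -hausman- compares, a sketch of the statistic is given below; $\hat\beta$ and $\hat V$ are generic labels for the stored coefficient vectors and their variance estimates, not names from the output.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[\hat V_{FE} - \hat V_{RE}\right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE}) \sim \chi^2(k)
```

Here $k$ is the number of coefficients being compared (the rank of the variance difference). A large p-value means the test finds no systematic difference between the FE and RE coefficients.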

Multicollinearity: running the VIF yields an average value of 1.01, so multicollinearity among the IVs does not appear to be a concern.
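
The VIF step is not shown as code here; one way it could be run is sketched below, assuming the VIFs came from a pooled OLS fit, since -estat vif- is a postestimation command for -regress-.

```stata
* Pooled OLS fit, used only to obtain variance inflation factors
regress Ri RmRf ESG
* Report the VIF for each regressor and the mean VIF
estat vif
```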

Autocorrelation:

Code:

xtserial Ri RmRf ESG

The test showed Prob > F = 0.9308, so the null of no first-order autocorrelation in the error terms is not rejected.

In sum: the data are heteroskedastic. Therefore, I assume I can run panel regressions with robust standard errors using:

Code:

xtreg Ri RmRf ESG, robust

The resulting table:

Code:

Random-effects GLS regression                   Number of obs     =         65
Group variable: ID                              Number of groups  =          5

R-squared:                                      Obs per group:
     Within  = 0.0000                                         min =         13
     Between = 0.0000                                         avg =       13.0
     Overall = 0.7903                                         max =         13

                                                Wald chi2(2)      =      73.92
corr(u_i, X) = 0 (assumed)                      Prob > chi2       =     0.0000

                                     (Std. err. adjusted for 5 clusters in ID)
------------------------------------------------------------------------------
             |               Robust
          Ri | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        RmRf |   1.042645   .1598131     6.52   0.000     .7294168    1.355873
         ESG |    .104679   .2043089     0.51   0.608    -.2957591    .5051172
       _cons |   .0447226   .0077805     5.75   0.000     .0294731    .0599721
-------------+----------------------------------------------------------------
     sigma_u |          0
     sigma_e |  .10128858
         rho |          0   (fraction of variance due to u_i)

My questions are the following:

Are my steps correct?
Must I conduct additional tests?
What other methods may I use to evaluate the factor model?
Is it right to conclude that the RmRf factor and the intercept have statistically significant coefficients? (According to the xtreg output above)

My sincere appreciation for your time and expertise.
Best regards,
Nicco

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17742
#2

11 Apr 2023, 13:47

Nicco:
welcome to this forum.
If you have a T>N panel dataset, -xtreg- is not the way to go.
Take a look at -xtgls- (as you already did) and -xtregar-.
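
A minimal sketch of the two suggested commands, assuming the panel variable is ID (as in the output above) and a hypothetical time variable called year:

```stata
* Declare the panel structure; year is a hypothetical name for the time variable
xtset ID year
* FGLS allowing a different error variance in each panel
xtgls Ri RmRf ESG, panels(heteroskedastic)
* Random-effects model with AR(1) disturbances
xtregar Ri RmRf ESG, re
```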

Kind regards,
Carlo
(Stata 19.0)
Nicco Casa

Join Date: Apr 2023

Posts: 2
#3

12 Apr 2023, 05:47

Dear Carlo,
Thank you for your prompt reply.

My understanding is that because I have a T>N panel dataset, with T = 13 time periods and N = 5 panels (portfolios), estimation with the GLS method is preferable.
The appropriate code would then be:

Code:

xtgls y x1 x2, options

However, I read somewhere (I cannot find the source now, but will look further if needed) that (F)GLS performs well only with a large enough sample, because it requires a consistent estimate of the variance-covariance matrix. I also read that xtregar is used when there is autocorrelation in the error terms (https://www.stata.com/manuals13/xtxtregar.pdf). I don't think this is the case with my data, as Prob > F = 0.9308 when I run:

Code:

xtserial Ri RmRf ESG

Can I not instead use robust standard errors with a simple panel regression?

Code:

xtreg y x1 x2, vce(robust)

Thank you in advance for your time.
Best regards,
Nicco
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17742
#4

12 Apr 2023, 08:26

Nicco:
I'd stick with -xtgls-, which offers options for non-default standard errors.
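
For illustration, the error structure in -xtgls- is chosen through its panels() and corr() options. A sketch, assuming the data are already -xtset- and using the variable names from the first post:

```stata
* Heteroskedastic across panels, no autocorrelation
xtgls Ri RmRf ESG, panels(heteroskedastic) corr(independent)
* Heteroskedastic and contemporaneously correlated across panels
xtgls Ri RmRf ESG, panels(correlated)
* Heteroskedastic across panels with a common AR(1) coefficient
xtgls Ri RmRf ESG, panels(heteroskedastic) corr(ar1)
```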

Kind regards,
Carlo
(Stata 19.0)