Panel Data Regression Steps

Hidayat Hadi

Join Date: Dec 2020
Posts: 2

22 Dec 2020, 06:15

Dear Statalist,

I am new to panel data regression analysis and Stata, so forgive me if my questions are too basic.
Currently I am conducting research on company profitability, with ROA as the dependent variable and Liquidity, Log Total Assets, Leverage, and Asset Structure as independent variables. The data cover 34 companies over 5 years, for 170 observations in total.

Code:

sum roa liquidity size leverage assetstructure

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         roa |       170    .0177706    .1900833     -1.538       .456
   liquidity |       170    1.910747    1.954133       .011     20.167
        size |       170    12.81888    .6328861      10.76      14.01
    leverage |       170    1.254612    2.414883    -15.817     13.152
assetstruc~e |       170    .6500412    .1825059       .106       .996

First, I ran the Breusch-Pagan LM test, which showed Prob > chibar2 = 0.00, meaning that RE is preferred over pooled OLS.

Code:

xtreg roa liquidity size leverage assetstructure, re
xttest0

Then I ran the Hausman test, which returned Prob > chi2 = 0.00, meaning that FE is more suitable than RE.

Code:

hausman fe re
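
For reference, -hausman fe re- expects two sets of stored estimates named fe and re. A minimal sketch of the full sequence, assuming the panel has been declared with -xtset- (the time variable name year is hypothetical; the post only shows the panel identifier code):

```stata
* Declare the panel structure; "year" is an assumed name for the time variable
xtset code year

* Fixed-effects fit, stored under the name "fe"
xtreg roa liquidity size leverage assetstructure, fe
estimates store fe

* Random-effects fit, stored under the name "re"
xtreg roa liquidity size leverage assetstructure, re
estimates store re

* Hausman test: consistent estimator (fe) first, efficient estimator (re) second
hausman fe re
```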

As for diagnostics, I checked for multicollinearity using vif, which showed no multicollinearity problems. Then I checked for autocorrelation using xtserial and for heteroscedasticity using xttest3; both tests indicate that my data suffer from these problems.
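
The post does not show how vif was obtained. -estat vif- is a postestimation command for -regress-, not for -xtreg-, so one common approach (an assumption here, not necessarily what was done) is to run it after a pooled OLS fit:

```stata
* Pooled OLS on the same variables, used only to obtain variance inflation factors
regress roa liquidity size leverage assetstructure

* Variance inflation factor for each regressor
estat vif
```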

Code:

xtserial roa liquidity size leverage assetstructure

Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,      33) =      4.472
           Prob > F =      0.0421

Code:

xtreg roa liquidity size leverage assetstructure, fe

Fixed-effects (within) regression               Number of obs      =       170
Group variable: code                            Number of groups   =        34

R-sq:  within  = 0.2890                         Obs per group: min =         5
       between = 0.0425                                        avg =       5.0
       overall = 0.0516                                        max =         5

                                                F(4,132)           =     13.41
corr(u_i, Xb)  = -0.8918                        Prob > F           =    0.0000

--------------------------------------------------------------------------------
           roa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
     liquidity |   .0184212    .007616     2.42   0.017      .003356    .0334865
          size |   .4378722   .0898747     4.87   0.000     .2600913    .6156532
      leverage |   .0204201   .0049537     4.12   0.000     .0106212     .030219
assetstructure |   .5211536   .1530098     3.41   0.001     .2184851    .8238221
         _cons |  -5.994851   1.186008    -5.05   0.000    -8.340892    -3.64881
---------------+----------------------------------------------------------------
       sigma_u |  .32016264
       sigma_e |  .13284938
           rho |  .85311272   (fraction of variance due to u_i)
--------------------------------------------------------------------------------
F test that all u_i=0:     F(33, 132) =     5.51             Prob > F = 0.0000

. xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (34)  =   77949.85
Prob>chi2 =      0.0000

After reading other questions on this site, I found that you can deal with autocorrelation and heteroscedasticity by clustering the standard errors. So I ran the fixed-effects regression with the cluster option:

Code:

xtreg roa liquidity size leverage assetstructure, fe cluster(code)

Fixed-effects (within) regression               Number of obs      =       170
Group variable: code                            Number of groups   =        34

R-sq:  within  = 0.2890                         Obs per group: min =         5
       between = 0.0425                                        avg =       5.0
       overall = 0.0516                                        max =         5

                                                F(4,33)            =      2.70
corr(u_i, Xb)  = -0.8918                        Prob > F           =    0.0472

                                    (Std. Err. adjusted for 34 clusters in code)
--------------------------------------------------------------------------------
               |               Robust
           roa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
     liquidity |   .0184212   .0091547     2.01   0.052    -.0002042    .0370467
          size |   .4378722   .1960061     2.23   0.032     .0390949    .8366495
      leverage |   .0204201   .0066806     3.06   0.004     .0068284    .0340118
assetstructure |   .5211536   .3114909     1.67   0.104    -.1125794    1.154887
         _cons |  -5.994851   2.717433    -2.21   0.034    -11.52351   -.4661927
---------------+----------------------------------------------------------------
       sigma_u |  .32016264
       sigma_e |  .13284938
           rho |  .85311272   (fraction of variance due to u_i)
--------------------------------------------------------------------------------
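
The current Stata documentation spells the same clustering request with the vce() option. A sketch, assuming code is the panel variable declared with -xtset- (as the "Group variable: code" header suggests):

```stata
* Documented spelling of the cluster option used above
xtreg roa liquidity size leverage assetstructure, fe vce(cluster code)

* With -xtreg, fe-, vce(robust) clusters on the panel variable,
* so this should give the same standard errors
xtreg roa liquidity size leverage assetstructure, fe vce(robust)
```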

My questions are:
1. Are my steps correct?
2. Do I need to check my data with other tests?

Regards,
Hadi

Tags: data, fixed effects, panel data, regression

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

22 Dec 2020, 08:58

Hadi:
Welcome to this forum.
The only step I would amend is the -hausman- test, since you go on to invoke non-default standard errors.
As -hausman- does not support non-default standard errors, you should check for the best specification via the community-contributed command -xtoverid- (just type -search xtoverid- to spot and install it).
As the null of -xtoverid- is that -re- is the way to go, there's no need to run and store -xtreg,fe- too; -xtreg,re- will suffice, as you can see in the following toy example (please note that I've already installed -xtoverid-, so I can skip downloading it):

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
. xi: xtreg ln_wage i.race grade , vce(cluster idcode)
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Random-effects GLS regression                   Number of obs     =     28,532
Group variable: idcode                          Number of groups  =      4,709

R-sq:                                           Obs per group:
     within  = 0.0000                                         min =          1
     between = 0.3170                                         avg =        6.1
     overall = 0.1970                                         max =         15

                                                Wald chi2(3)      =    1864.97
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                             (Std. Err. adjusted for 4,709 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Irace_2 |  -.0459478   .0107442    -4.28   0.000     -.067006   -.0248895
    _Irace_3 |   .1039604   .0511939     2.03   0.042     .0036222    .2042986
       grade |   .0909639   .0021824    41.68   0.000     .0866865    .0952413
       _cons |   .5143082   .0285046    18.04   0.000     .4584403    .5701761
-------------+----------------------------------------------------------------
     sigma_u |  .30393641
     sigma_e |  .32028665
         rho |    .473825   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtoverid

Test of overidentifying restrictions: fixed vs random effects
Cross-section time-series model: xtreg re  robust cluster(idcode)
Sargan-Hansen statistic 1864.974  Chi-sq(3)   P-value = 0.0000

.

Two sidelights as I'm finishing my reply off:
1) like many glorious but a bit old-fashioned community-contributed Stata commands, -xtoverid- does not support -fvvarlist- notation; hence, you should prefix the whole code with -xi:- if one or more of your predictors are categorical;
2) the results of -xtoverid- point you toward the -fe- specification.
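
Applied to the model in the original post, the same check could be sketched as follows. This assumes -xtoverid- is installed (it is distributed via SSC and may ask for -ivreg2- as well) and that the panel is already declared with -xtset-. None of the predictors is categorical, so the -xi:- prefix should not be needed:

```stata
* One-time install of the community-contributed command
* ssc install xtoverid

* Random-effects fit with cluster-robust standard errors
xtreg roa liquidity size leverage assetstructure, re vce(cluster code)

* Null hypothesis: the RE specification is valid; rejection favors FE
xtoverid
```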

Kind regards,
Carlo
(Stata 19.0)


Hidayat Hadi

23 Dec 2020, 00:55

Carlo:
Thank you for the reply.
So, in this case -xtoverid- replaces -hausman- as the FE/RE test, right?
I have one more question, if you don't mind. When I use clustered standard errors, Stata does not give me the F test at the bottom of the output, as the regular -xtreg, fe- does. Is there something that I missed?

Carlo Lazzaro


23 Dec 2020, 01:28

Hidayat:
1) correct;
2) -xtreg- behaves as expected: the F-test statistic is too complex to compute with non-default standard errors (see Example 3, -xtreg- entry, Stata .pdf manual).

Kind regards,
Carlo
(Stata 19.0)