  • #16
    Eddie:
Yes, go with -regress- and standard errors clustered on -panelid-, as observations belonging to the same panel are not independent.
    Kind regards,
    Carlo
    (Stata 19.0)
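A minimal sketch of that suggestion (the variable names y, x1-x3 are placeholders, not from the original post):

```stata
* pooled OLS with standard errors clustered at the panel level;
* -panelid- identifies the panel, y and x1-x3 are hypothetical names
regress y x1 x2 x3, vce(cluster panelid)
```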

    • #17
      Thanks a lot!

Hopefully these are the last two questions. To get better results, I figured I would generate the independent variables with a one-quarter lag and regress the dependent variable on the independent variables' prior-quarter values. Do you think this is a good idea?
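One way to do that in Stata is with the lag operator after declaring the panel structure (a sketch; the quarterly date variable qdate is a hypothetical name):

```stata
* declare the panel structure: CompanyID panels, quarterly time variable qdate
xtset CompanyID qdate

* L. returns the previous quarter's value, so every regressor enters with a one-quarter lag
regress roa L.mv L.rev L.inv L.ltdebt L.empl, vce(cluster CompanyID)
```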

      Moreover, in the last regression where the dependent variable is Return on Assets (in percentage form) I got the following results:

      Code:
      . reg roa mv1 rev1 inv1 ltdebt1 empl1, cluster(CompanyID)
      
      Linear regression                               Number of obs     =        265
                                                      F(3, 4)           =          .
                                                      Prob > F          =          .
                                                      R-squared         =     0.4051
                                                      Root MSE          =     .02799
      
                                    (Std. Err. adjusted for 5 clusters in CompanyID)
      ------------------------------------------------------------------------------
                   |               Robust
               roa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               mv1 |   4.81e-07   1.01e-07     4.77   0.009     2.01e-07    7.61e-07
              rev1 |  -1.80e-06   2.02e-07    -8.93   0.001    -2.36e-06   -1.24e-06
              inv1 |   2.86e-06   8.78e-07     3.26   0.031     4.25e-07    5.30e-06
           ltdebt1 |   2.06e-07   3.58e-07     0.58   0.595    -7.87e-07    1.20e-06
             empl1 |  -8.77e-06   .0000176    -0.50   0.644    -.0000576    .0000401
             _cons |    .030519   .0074747     4.08   0.015     .0097657    .0512722
      ------------------------------------------------------------------------------
      
      . 
      end of do-file
How would you explain the extremely low values of the coefficients?

      Thank you in advance!!!

      • #18
Does anyone have any idea?

        • #19
          Mr Lazzaro,

I am trying to write in my paper why exactly I cannot use FE or RE on my data and should instead use pooled OLS as you suggested, but I cannot understand the exact reason behind it. The Hausman test produces an error, and after that error you suggested using xtoverid. Moreover, you suggested using clustered standard errors. How are all of these related to each other?

          Thank you in advance for your time!!!

          • #20
            Eddie: You need to read an introduction to panel data methods. It looks like you have 52 weeks on five firms -- is that correct? That barely qualifies as panel data, as you have N = 5, T = 52. More importantly, you're trying to apply large N, small T methods to a very small N, large T setting.

            There is no reason at all to think of using RE in this setting. It just doesn't fit. Use the user-written xtscc command to allow for serial correlation and include dummies for four of the five firms. Include the dummies if they are significant.

            Code:
            xtset CompanyID week
            xtscc y x1 x2 ... xK i.firm
            Stata will automatically choose a Newey-West bandwidth for you, and there's no reason not to use it. Then test that the four firm dummies are significant.
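The joint significance of those firm dummies can be checked with -testparm- right after estimation (a sketch, following the i.firm notation above):

```stata
* after xtscc, jointly test that the four included firm dummies are all zero
testparm i.firm
```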

            By the way, you should rescale your explanatory variables. I suspect roa is a proportion; maybe a percent. But the other variables are probably measured in dollars or Euros or something like that. I'd define them in terms of millions of dollars, say.
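For example, rescaling dollar-denominated regressors to millions could be done like this (the variable list is hypothetical; adjust it to the variables actually measured in dollars):

```stata
* convert regressors measured in dollars to millions of dollars
foreach v of varlist mv1 rev1 inv1 ltdebt1 {
    replace `v' = `v'/1000000
}
```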

            Finally, you should never cluster with N = 5. Clustering needs a substantially large N to be justified.

            JW

            • #21
Thank you, Mr. Wooldridge, for your response!

Earlier in this thread it was suggested that I use pooled OLS with clustered standard errors. I take it you are implying that this method isn't good enough? The results are not bad, especially after I added some exogenous macroeconomic variables as a kind of robustness check.

The explanatory variables are already in millions of dollars, by the way. That's why these values seem odd to me.

              • #22
To complement my previous post: the first table contains the results of the method you suggested, and the second contains the results of my regression after I included some other variables too.

                Code:
                . xtscc mv revch roa invch ltdebtch empl gdp infl comchng i.CompanyID
                
                Regression with Driscoll-Kraay standard errors   Number of obs     =       265
                Method: Pooled OLS                               Number of groups  =         5
                Group variable (i): CompanyID                    F( 12,    56)     =    273.12
                maximum lag: 3                                   Prob > F          =    0.0000
                                                                 R-squared         =    0.8867
                                                                 Root MSE          =   2.6e+04
                
                ------------------------------------------------------------------------------
                             |             Drisc/Kraay
                          mv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                       revch |   9638.864   7055.266     1.37   0.177    -4494.532    23772.26
                         roa |   377695.3   53498.47     7.06   0.000       270525    484865.7
                       invch |   15447.07   12705.81     1.22   0.229    -10005.73    40899.87
                    ltdebtch |  -1257.886   645.6557    -1.95   0.056     -2551.29    35.51802
                        empl |   271.7709   64.36855     4.22   0.000     142.8252    400.7166
                         gdp |   276415.6   82086.81     3.37   0.001     111975.9    440855.2
                        infl |   388381.7   571862.4     0.68   0.500    -757196.4     1533960
                     comchng |  -69343.29    47348.5    -1.46   0.149    -164193.7    25507.16
                             |
                   CompanyID |
                          1  |          0  (empty)
                          2  |   7614.616   6086.724     1.25   0.216    -4578.558    19807.79
                          3  |  -8492.987   14028.14    -0.61   0.547    -36594.73    19608.76
                          4  |  -54956.28   13827.69    -3.97   0.000    -82656.47   -27256.09
                          5  |    -369378   128456.8    -2.88   0.006    -626707.9   -112048.1
                             |
                       _cons |  -20646.85   12107.87    -1.71   0.094    -44901.82    3608.128
                ------------------------------------------------------------------------------
                
                . 
                end of do-file
                Code:
                . xtreg mv revch roa invch ltdebtch empl gdp infl comchng, cluster(CompanyID)
                
                Random-effects GLS regression                   Number of obs     =        265
                Group variable: CompanyID                       Number of groups  =          5
                
                R-sq:                                           Obs per group:
                     within  = 0.3282                                         min =         52
                     between = 0.9523                                         avg =       53.0
                     overall = 0.8270                                         max =         55
                
                                                                Wald chi2(4)      =          .
                corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .
                
                                              (Std. Err. adjusted for 5 clusters in CompanyID)
                ------------------------------------------------------------------------------
                             |               Robust
                          mv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                       revch |   19262.69   8575.594     2.25   0.025     2454.838    36070.55
                         roa |   543566.1   181206.3     3.00   0.003     188408.3      898724
                       invch |   8015.306   32146.12     0.25   0.803    -54989.93    71020.54
                    ltdebtch |  -1252.499   1278.132    -0.98   0.327    -3757.592    1252.593
                        empl |   84.56634   3.978322    21.26   0.000     76.76897    92.36371
                         gdp |   382915.4   79073.84     4.84   0.000     227933.6    537897.3
                        infl |   547319.1   241415.3     2.27   0.023      74153.7     1020484
                     comchng |  -98317.21   26818.71    -3.67   0.000    -150880.9   -45753.51
                       _cons |   7278.837   11204.96     0.65   0.516    -14682.48    29240.16
                -------------+----------------------------------------------------------------
                     sigma_u |          0
                     sigma_e |  25567.577
                         rho |          0   (fraction of variance due to u_i)
                ------------------------------------------------------------------------------
                
                . 
                end of do-file

                • #23
Neither RE estimation nor clustering can be justified with N = 5 and T = 52. That you get estimates that might seem plausible is neither here nor there; if that were how we evaluated statistical and econometric methods, the fields would be useless. For a given data set I can come up with innumerable terrible estimators that produce seemingly sensible estimates.

                  BTW, your coefficients now seem, aesthetically, too large. Maybe measure the original variables in billions. That’s somewhat common.

                  • #24
So would you say that my model is generally inadequate for a thesis?

                    • #25
I’m not saying that. I’m suggesting what I think are suitable econometric methods: you need methods suited to large T, small N. For estimation, OLS with firm dummies is fine. For standard errors, use xtscc, as you did. RE is not appropriate.

                      • #26
And is the interpretation of these dummies that, for each company, the aggregate effect of the independent variables on the dependent variable varies?

                        • #27
                          .
                          Last edited by SABA GULNAZ; 01 Jul 2020, 11:24.
