Robust Hausman Test for TWFE

Chaeyeon Song

Join Date: Feb 2023

Posts: 6
#1

12 Feb 2023, 22:52

Hello, I am currently writing a thesis on the relationship between public family spending and the suicide rate, and I have run into some issues with the regressions. Given the data structure and the small number of clusters I have, I need to use bootstrap standard errors. The problems are:
With the official hausman command in Stata, I can't use cluster-robust or bootstrap standard errors.

With the community-contributed command "xtoverid", I can use cluster-robust standard errors but not bootstrap standard errors.

There is another community-contributed command, "rhausman", which builds bootstrapping into the Hausman test, but it can't be used for a two-way fixed effects (TWFE) regression because it does not allow factor variables.

Is there any advice on how to proceed?
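
For reference, one route sometimes suggested for this situation is the regression-based (Mundlak-type) form of the Hausman test, which can be combined with either cluster-robust or bootstrap standard errors. The lines below are only a sketch under stated assumptions: the panel is balanced and already declared with xtset, the variable names suicide_t_t, family, id and year are those used in the regressions later in this thread, and family_mean is a hypothetical helper variable introduced for illustration.

```stata
* Sketch only: regression-based (Mundlak-type) Hausman test with year effects.
* Assumes the data are already declared as a panel, e.g. -xtset id year-.
* family_mean is a hypothetical helper variable, not part of the original data.
* If some observations drop out of the estimation sample, compute the mean
* over the estimation sample only.
bysort id: egen double family_mean = mean(family)

* RE model augmented with the panel mean of the time-varying regressor.
* In a balanced panel the panel means of the year dummies are identical
* across panels, so they are collinear with the constant and are left out.
xtreg suicide_t_t family family_mean i.year, re vce(cluster id)

* H0: the coefficient on family_mean is zero (RE not rejected against FE).
test family_mean

* The same test with panel-bootstrap standard errors instead.
xtreg suicide_t_t family family_mean i.year, re vce(bootstrap, reps(999) seed(12345))
test family_mean
```

Because the year effects enter as ordinary dummies in the auxiliary regression, this version does not run into the factor-variable restriction. The number of replications and the seed are arbitrary choices, and with only 23 clusters the resulting p-values should still be read with caution.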

Thanks!

Last edited by Chaeyeon Song; 12 Feb 2023, 22:54. Reason: TWFE
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#2

13 Feb 2023, 00:12

Chaeyeon:
welcome to this forum.
If you have fewer than about 30 clusters, cluster-robust standard errors may be (highly) misleading and more biased than their default counterparts.

Last edited by Carlo Lazzaro; 13 Feb 2023, 00:18.

Kind regards,
Carlo
(Stata 19.0)
Chaeyeon Song

Join Date: Feb 2023

Posts: 6
#3

13 Feb 2023, 00:26

Hi Carlo,

Thanks for the reply!

I have 23 clusters, which is why I am trying to use bootstrap standard errors. For the Hausman test, then, do you recommend using the regular "hausman" command instead of xtoverid or rhausman, even though the errors are almost certainly heteroskedastic and serially correlated?

Last edited by Chaeyeon Song; 13 Feb 2023, 01:07.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#4

13 Feb 2023, 05:04

Chaeyeon:
could you please share what you typed and what Stata gave you back with default and cluster-robust standard errors? Thanks.

Kind regards,
Carlo
(Stata 19.0)
Chaeyeon Song

Join Date: Feb 2023

Posts: 6
#5

14 Feb 2023, 21:10

. xtreg suicide_t_t family i.year, fe

Fixed-effects (within) regression Number of obs = 506
Group variable: id Number of groups = 23

R-squared: Obs per group:
Within = 0.0611 min = 22
Between = 0.0105 avg = 22.0
Overall = 0.0151 max = 22

F(22,461) = 1.36
corr(u_i, Xb) = -0.2312 Prob > F = 0.1259

------------------------------------------------------------------------------
suicide_t_t | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
family | 1.655328 .4017802 4.12 0.000 .8657804 2.444876
|
year |
1994 | .0380431 .7078393 0.05 0.957 -1.352948 1.429034
1995 | -.0042615 .7078454 -0.01 0.995 -1.395265 1.386742
1996 | -.0464792 .7079986 -0.07 0.948 -1.437784 1.344825
1997 | .1605435 .7083025 0.23 0.821 -1.231358 1.552445
1998 | -.0510126 .708509 -0.07 0.943 -1.44332 1.341295
1999 | -.5459157 .7082766 -0.77 0.441 -1.937766 .8459351
2000 | -.46362 .7079426 -0.65 0.513 -1.854815 .9275744
2001 | -.5073262 .70845 -0.72 0.474 -1.899518 .8848652
2002 | -.4624964 .7100076 -0.65 0.515 -1.857749 .932756
2003 | -.9951192 .7138789 -1.39 0.164 -2.397979 .4077408
2004 | -.8109109 .7124466 -1.14 0.256 -2.210956 .5891345
2005 | -1.041151 .712626 -1.46 0.145 -2.441549 .3592468
2006 | -1.25981 .7132401 -1.77 0.078 -2.661415 .1417942
2007 | -1.330958 .7134689 -1.87 0.063 -2.733012 .0710966
2008 | -1.560678 .7240807 -2.16 0.032 -2.983586 -.1377701
2009 | -1.50991 .7388921 -2.04 0.042 -2.961924 -.0578963
2010 | -1.540234 .7352847 -2.09 0.037 -2.985159 -.0953095
2011 | -1.706913 .7294929 -2.34 0.020 -3.140456 -.2733692
2012 | -1.722737 .7313306 -2.36 0.019 -3.159892 -.2855823
2013 | -1.694522 .7304072 -2.32 0.021 -3.129862 -.2591816
2014 | -1.712957 .7274553 -2.35 0.019 -3.142496 -.2834181
|
_cons | 10.72773 .8739664 12.27 0.000 9.010282 12.44519
-------------+----------------------------------------------------------------
sigma_u | 4.9654088
sigma_e | 2.4001944
rho | .81059666 (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(22, 461) = 88.87 Prob > F = 0.0000

. xtreg suicide_t_t family i.year, fe vce(bootstrap)
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50

Fixed-effects (within) regression Number of obs = 506
Group variable: id Number of groups = 23

R-squared: Obs per group:
Within = 0.0611 min = 22
Between = 0.0105 avg = 22.0
Overall = 0.0151 max = 22

Wald chi2(22) = 366.24
corr(u_i, Xb) = -0.2312 Prob > chi2 = 0.0000

(Replications based on 23 clusters in id)
------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
suicide_t_t | coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
family | 1.655328 .8948897 1.85 0.064 -.0986236 3.40928
|
year |
1994 | .0380431 .2514234 0.15 0.880 -.4547378 .530824
1995 | -.0042615 .2799368 -0.02 0.988 -.5529274 .5444045
1996 | -.0464792 .3466707 -0.13 0.893 -.7259413 .6329829
1997 | .1605435 .4193793 0.38 0.702 -.6614249 .982512
1998 | -.0510126 .7046077 -0.07 0.942 -1.432018 1.329993
1999 | -.5459157 .6674996 -0.82 0.413 -1.854191 .7623596
2000 | -.46362 .7356451 -0.63 0.529 -1.905458 .9782178
2001 | -.5073262 .6119256 -0.83 0.407 -1.706678 .6920259
2002 | -.4624964 .6804209 -0.68 0.497 -1.796097 .8711041
2003 | -.9951192 .8825407 -1.13 0.260 -2.724867 .7346288
2004 | -.8109109 .8105235 -1.00 0.317 -2.399508 .777686
2005 | -1.041151 .9031191 -1.15 0.249 -2.811232 .7289298
2006 | -1.25981 .7682959 -1.64 0.101 -2.765643 .246022
2007 | -1.330958 .8447654 -1.58 0.115 -2.986668 .3247521
2008 | -1.560678 .9417928 -1.66 0.097 -3.406558 .2852022
2009 | -1.50991 1.087543 -1.39 0.165 -3.641455 .6216346
2010 | -1.540234 1.101551 -1.40 0.162 -3.699234 .618765
2011 | -1.706913 1.095263 -1.56 0.119 -3.853589 .4397639
2012 | -1.722737 1.044844 -1.65 0.099 -3.770594 .3251194
2013 | -1.694522 1.079785 -1.57 0.117 -3.810861 .4218175
2014 | -1.712957 1.057852 -1.62 0.105 -3.78631 .360395
|
_cons | 10.72773 1.679682 6.39 0.000 7.435618 14.01985
-------------+----------------------------------------------------------------
sigma_u | 4.9654088
sigma_e | 2.4001944
rho | .81059666 (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtreg suicide_t_t family i.year, fe cluster(id)

Fixed-effects (within) regression Number of obs = 506
Group variable: id Number of groups = 23

R-squared: Obs per group:
Within = 0.0611 min = 22
Between = 0.0105 avg = 22.0
Overall = 0.0151 max = 22

F(22,22) = 23.84
corr(u_i, Xb) = -0.2312 Prob > F = 0.0000

(Std. err. adjusted for 23 clusters in id)
------------------------------------------------------------------------------
| Robust
suicide_t_t | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
family | 1.655328 1.007151 1.64 0.114 -.4333761 3.744032
|
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#6

14 Feb 2023, 23:36

Chaeyeon:
1) the main issue here rests on the very low within R-squared of your -fe- regression, which calls for a test of the correctness of the functional form of the regressand;
2) that said, I would stick with default standard errors, as your number of clusters is too small to make non-default standard errors reliable (and, all in all, your coefficients do not change regardless of the type of standard errors you invoke);
3) in your future posts, please use CODE delimiters to share what you typed and what Stata gave you back. Thanks.
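
As an illustration of point 1, one common way to probe the functional form is a RESET-type check that adds the squared linear prediction to the model. This is only a sketch: fitted and sq_fitted are hypothetical helper variable names, and the check is one of several possible diagnostics rather than a definitive test.

```stata
* Sketch only: a RESET-type check on the FE specification.
* fitted and sq_fitted are hypothetical helper variable names.
xtreg suicide_t_t family i.year, fe
predict double fitted, xb
generate double sq_fitted = fitted^2

* If the squared fitted values enter significantly, the linear
* specification may be missing curvature or interactions.
xtreg suicide_t_t family i.year sq_fitted, fe
test sq_fitted
```

A significant coefficient on sq_fitted would point toward added terms (for example squares or interactions) or a transformation of the regressand.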

Kind regards,
Carlo
(Stata 19.0)
Chaeyeon Song

Join Date: Feb 2023

Posts: 6
#7

15 Feb 2023, 09:26

Hello,

Thanks a lot for the feedback.

I do agree that the R-squared values are very low in those models. When I include control variables, they increase to 0.3~0.4, which is similar to what has been found in the previous literature. I will definitely consider non-linear models, however.

Regarding the second point, it seems to me that the default standard errors are biased downwards to a great extent, which is why I am still considering cluster-robust SEs for the Hausman test. Also, is there any reason why I should not use bootstrap standard errors (it's not for the Hausman test, but still...)?

Regards,
Chaeyeon
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#8

15 Feb 2023, 09:43

Chaeyeon:
with a low number of clusters, the cluster-robust standard errors can be misleading as well (and it is difficult, if not impossible, to assess which one performs worse than its counterparts).
There is no restriction on using bootstrapped standard errors (especially for those regression commands that do not allow clustered standard errors). However, in your case the bootstrap replications are also based on the limited number of clusters.
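
As a hedged sketch of what this could look like in practice: the bootstrap run shown earlier in the thread used the default of 50 replications, so a larger number of replications and a fixed seed would at least make the results reproducible and less noisy. The second part assumes the community-contributed boottest package is installed; it runs a wild cluster bootstrap test, which is one approach used when clusters are few. The replication counts and the seed are arbitrary illustrative choices.

```stata
* Sketch only. More replications and a fixed seed make the
* panel-bootstrap standard errors reproducible and less noisy.
xtreg suicide_t_t family i.year, fe vce(bootstrap, reps(999) seed(12345))

* Wild cluster bootstrap test of H0: coefficient on family = 0.
* Assumes the community-contributed boottest package is installed:
*     ssc install boottest
areg suicide_t_t family i.year, absorb(id) vce(cluster id)
set seed 12345
boottest family, reps(9999)
```

Neither step removes the underlying limitation of having only 23 clusters to resample from.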

Kind regards,
Carlo
(Stata 19.0)
Chaeyeon Song

Join Date: Feb 2023

Posts: 6
#9

15 Feb 2023, 18:28

Hi,

Considering that both the default and the cluster-robust options are likely to be biased, what is the logic behind choosing the default one, especially when the bootstrapped SEs are much closer to the cluster-robust SEs?

Regards,
Chaeyeon
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#10

16 Feb 2023, 00:52

Chaeyeon:
see https://cameron.econ.ucdavis.edu/res...5_February.pdf

Kind regards,
Carlo
(Stata 19.0)