  • Robust Hausman Test for TWFE

    Hello, I am currently writing a thesis on the relationship between public family spending and the suicide rate, and I have run into some issues with the regressions. Given the data structure and the small number of clusters I have, I need to use bootstrap standard errors. The problems are:
    1. With the official hausman command in Stata, I can't use cluster-robust or bootstrap standard errors.
    2. With the community-contributed command "xtoverid", I can use cluster-robust standard errors but not bootstrap standard errors.
    3. Another community-contributed command, "rhausman", incorporates bootstrapping into the Hausman test, but it can't be used for a two-way fixed effects (TWFE) regression because it does not allow factor variables.
    Is there any advice on how to proceed?

    Thanks!
    Last edited by Chaeyeon Song; 12 Feb 2023, 22:54. Reason: TWFE
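
    One route sometimes used when hausman cannot accommodate robust variance estimators is an auxiliary-regression (Mundlak-type) version of the test: add the panel-level means of the time-varying regressors to a random-effects model and test them, using whichever variance estimator is appropriate. The following is only a sketch. It assumes the data are already xtset id year and that the panel is balanced, so the year dummies have no panel-specific mean to add. mean_family is a name made up for illustration, and the replication count and seed are arbitrary.

    ```stata
    * Assumes: xtset id year has been run; variable names follow the thread
    * mean_family is a hypothetical helper variable (panel mean of family)
    bysort id: egen mean_family = mean(family)

    * Random-effects model augmented with the panel mean, cluster-robust VCE
    xtreg suicide_t_t family mean_family i.year, re vce(cluster id)

    * H0: coefficient on mean_family is zero (RE and FE estimates agree)
    test mean_family

    * Same test with cluster-bootstrap standard errors
    * (999 replications and the seed are arbitrary choices)
    xtreg suicide_t_t family mean_family i.year, re vce(bootstrap, reps(999) seed(12345))
    test mean_family
    ```

    Because the year effects enter as ordinary indicator variables, this form does not run into the factor-variable restriction mentioned in point 3. With only 23 clusters, the small-sample caveats about cluster-based inference still apply to either variance estimator.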

  • #2
    Chaeyeon:
    welcome to this forum.
    If you have fewer than roughly 30 clusters, cluster-robust standard errors may be (highly) misleading and more biased than their default counterparts.
    Last edited by Carlo Lazzaro; 13 Feb 2023, 00:18.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hi Carlo,

      Thanks for the reply!

      I have 23 clusters, which is why I am trying to use bootstrap standard errors. For the Hausman test, then, do you recommend using the regular "hausman" command instead of xtoverid or rhausman, even though the errors are certainly heteroskedastic and serially correlated?
      Last edited by Chaeyeon Song; 13 Feb 2023, 01:07.


      • #4
        Chaeyeon:
        could you please share what you typed and what Stata gave you back with default and cluster-robust standard errors? Thanks.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          . xtreg suicide_t_t family i.year, fe

          Fixed-effects (within) regression               Number of obs     =        506
          Group variable: id                              Number of groups  =         23

          R-squared:                                      Obs per group:
               Within  = 0.0611                                         min =         22
               Between = 0.0105                                         avg =       22.0
               Overall = 0.0151                                         max =         22

                                                          F(22,461)         =       1.36
          corr(u_i, Xb) = -0.2312                         Prob > F          =     0.1259

          ------------------------------------------------------------------------------
           suicide_t_t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                family |   1.655328   .4017802     4.12   0.000     .8657804    2.444876
                       |
                  year |
                  1994 |   .0380431   .7078393     0.05   0.957    -1.352948    1.429034
                  1995 |  -.0042615   .7078454    -0.01   0.995    -1.395265    1.386742
                  1996 |  -.0464792   .7079986    -0.07   0.948    -1.437784    1.344825
                  1997 |   .1605435   .7083025     0.23   0.821    -1.231358    1.552445
                  1998 |  -.0510126    .708509    -0.07   0.943     -1.44332    1.341295
                  1999 |  -.5459157   .7082766    -0.77   0.441    -1.937766    .8459351
                  2000 |    -.46362   .7079426    -0.65   0.513    -1.854815    .9275744
                  2001 |  -.5073262     .70845    -0.72   0.474    -1.899518    .8848652
                  2002 |  -.4624964   .7100076    -0.65   0.515    -1.857749     .932756
                  2003 |  -.9951192   .7138789    -1.39   0.164    -2.397979    .4077408
                  2004 |  -.8109109   .7124466    -1.14   0.256    -2.210956    .5891345
                  2005 |  -1.041151    .712626    -1.46   0.145    -2.441549    .3592468
                  2006 |   -1.25981   .7132401    -1.77   0.078    -2.661415    .1417942
                  2007 |  -1.330958   .7134689    -1.87   0.063    -2.733012    .0710966
                  2008 |  -1.560678   .7240807    -2.16   0.032    -2.983586   -.1377701
                  2009 |   -1.50991   .7388921    -2.04   0.042    -2.961924   -.0578963
                  2010 |  -1.540234   .7352847    -2.09   0.037    -2.985159   -.0953095
                  2011 |  -1.706913   .7294929    -2.34   0.020    -3.140456   -.2733692
                  2012 |  -1.722737   .7313306    -2.36   0.019    -3.159892   -.2855823
                  2013 |  -1.694522   .7304072    -2.32   0.021    -3.129862   -.2591816
                  2014 |  -1.712957   .7274553    -2.35   0.019    -3.142496   -.2834181
                       |
                 _cons |   10.72773   .8739664    12.27   0.000     9.010282    12.44519
          -------------+----------------------------------------------------------------
               sigma_u |  4.9654088
               sigma_e |  2.4001944
                   rho |  .81059666   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(22, 461) = 88.87                    Prob > F = 0.0000

          . xtreg suicide_t_t family i.year, fe vce(bootstrap)
          (running xtreg on estimation sample)

          Bootstrap replications (50)
          ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
          ..................................................    50

          Fixed-effects (within) regression               Number of obs     =        506
          Group variable: id                              Number of groups  =         23

          R-squared:                                      Obs per group:
               Within  = 0.0611                                         min =         22
               Between = 0.0105                                         avg =       22.0
               Overall = 0.0151                                         max =         22

                                                          Wald chi2(22)     =     366.24
          corr(u_i, Xb) = -0.2312                         Prob > chi2       =     0.0000

                                               (Replications based on 23 clusters in id)
          ------------------------------------------------------------------------------
                       |   Observed   Bootstrap                         Normal-based
           suicide_t_t | coefficient  std. err.      z    P>|z|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                family |   1.655328   .8948897     1.85   0.064    -.0986236     3.40928
                       |
                  year |
                  1994 |   .0380431   .2514234     0.15   0.880    -.4547378     .530824
                  1995 |  -.0042615   .2799368    -0.02   0.988    -.5529274    .5444045
                  1996 |  -.0464792   .3466707    -0.13   0.893    -.7259413    .6329829
                  1997 |   .1605435   .4193793     0.38   0.702    -.6614249     .982512
                  1998 |  -.0510126   .7046077    -0.07   0.942    -1.432018    1.329993
                  1999 |  -.5459157   .6674996    -0.82   0.413    -1.854191    .7623596
                  2000 |    -.46362   .7356451    -0.63   0.529    -1.905458    .9782178
                  2001 |  -.5073262   .6119256    -0.83   0.407    -1.706678    .6920259
                  2002 |  -.4624964   .6804209    -0.68   0.497    -1.796097    .8711041
                  2003 |  -.9951192   .8825407    -1.13   0.260    -2.724867    .7346288
                  2004 |  -.8109109   .8105235    -1.00   0.317    -2.399508     .777686
                  2005 |  -1.041151   .9031191    -1.15   0.249    -2.811232    .7289298
                  2006 |   -1.25981   .7682959    -1.64   0.101    -2.765643     .246022
                  2007 |  -1.330958   .8447654    -1.58   0.115    -2.986668    .3247521
                  2008 |  -1.560678   .9417928    -1.66   0.097    -3.406558    .2852022
                  2009 |   -1.50991   1.087543    -1.39   0.165    -3.641455    .6216346
                  2010 |  -1.540234   1.101551    -1.40   0.162    -3.699234     .618765
                  2011 |  -1.706913   1.095263    -1.56   0.119    -3.853589    .4397639
                  2012 |  -1.722737   1.044844    -1.65   0.099    -3.770594    .3251194
                  2013 |  -1.694522   1.079785    -1.57   0.117    -3.810861    .4218175
                  2014 |  -1.712957   1.057852    -1.62   0.105     -3.78631     .360395
                       |
                 _cons |   10.72773   1.679682     6.39   0.000     7.435618    14.01985
          -------------+----------------------------------------------------------------
               sigma_u |  4.9654088
               sigma_e |  2.4001944
                   rho |  .81059666   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------

          . xtreg suicide_t_t family i.year, fe cluster(id)

          Fixed-effects (within) regression               Number of obs     =        506
          Group variable: id                              Number of groups  =         23

          R-squared:                                      Obs per group:
               Within  = 0.0611                                         min =         22
               Between = 0.0105                                         avg =       22.0
               Overall = 0.0151                                         max =         22

                                                          F(22,22)          =      23.84
          corr(u_i, Xb) = -0.2312                         Prob > F          =     0.0000

                                                  (Std. err. adjusted for 23 clusters in id)
          ------------------------------------------------------------------------------
                       |               Robust
           suicide_t_t | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------+----------------------------------------------------------------
                family |   1.655328   1.007151     1.64   0.114    -.4333761    3.744032
                       |


          • #6
            Chaeyeon:
            1) the main issue here rests on the very low within R-squared of your -fe- regression, which calls for a test of whether the functional form of the regressand is correctly specified;
            2) that said, I would stick with default standard errors, as your number of clusters is too small to make non-default standard errors reliable (and all in all your results do not change that much regardless of the type of standard errors you invoke);
            3) in your future posts, please use CODE delimiters to share what you typed and what Stata gave you back. Thanks.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              Hello,

              Thanks a lot for the feedback.

              I do agree that the R-squared values are very low in those models. When I include control variables, they increase to 0.3~0.4, which is similar to what has been found in the previous literature. I will definitely consider non-linear models, however.

              Regarding the second point, it seems to me that the default standard errors are biased downwards to a great extent, which is why I am still considering cluster-robust SEs for the Hausman test. Also, is there any reason why I should not use bootstrap standard errors (it's not for the Hausman test, but still...)?

              Regards,
              Chaeyeon


              • #8
                Chaeyeon:
                with a low number of clusters, the cluster-robust standard errors can be misleading as well (and it is difficult or impossible to assess which type performs worse than its counterparts).
                There is no restriction on using bootstrapped standard errors (especially for those regression commands that do not allow clustered standard errors). However, in your case the bootstrap replications are also based on the limited number of clusters.
                Kind regards,
                Carlo
                (Stata 19.0)
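
                For reference, the bootstrap run shown in #5 used the default of 50 replications. A minimal sketch of requesting more replications with a fixed seed, assuming the same model as in #5; the value 999 and the seed are arbitrary illustrations, not recommendations from this thread:

                ```stata
                * reps() and seed() are suboptions of vce(bootstrap); values are illustrative
                xtreg suicide_t_t family i.year, fe vce(bootstrap, reps(999) seed(12345))
                ```

                More replications reduce simulation noise in the reported standard errors, but the resampling is still over the same 23 clusters, so the limitation noted above remains.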


                • #9
                  Hi,

                  Considering that both the default and the cluster-robust options are likely to be biased, what is the logic behind choosing the default one, especially when the bootstrapped SEs are quite similar to the cluster-robust SEs?

                  Regards,
                  Chaeyeon


                  • #10
                    Chaeyeon:
                    see https://cameron.econ.ucdavis.edu/res...5_February.pdf
                    Kind regards,
                    Carlo
                    (Stata 19.0)
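
                    One approach often discussed for inference with few clusters is the wild cluster bootstrap. The sketch below assumes the community-contributed boottest package is installed from SSC and uses areg so that the panel fixed effects are absorbed. The option values (Webb weights, 999 replications, the seed) are illustrative assumptions and should be checked against the package's help file.

                    ```stata
                    * Assumption: boottest is a community-contributed package, not official Stata
                    ssc install boottest

                    * Same TWFE model as in #5, with id fixed effects absorbed
                    areg suicide_t_t family i.year, absorb(id) vce(cluster id)

                    * Wild cluster bootstrap test of H0: coefficient on family = 0
                    boottest family, reps(999) weighttype(webb) seed(12345) nograph
                    ```

                    This reports a bootstrap p-value and confidence set for the coefficient on family rather than a replacement standard error, so it addresses inference on that coefficient, not the Hausman test itself.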
