  • Clustered standard errors at industry or firm level?

    Dear Stata Community

    I am aware that this question has come up a lot in this forum, yet I am not sure how to handle this problem in my case.
    In my master's thesis I am investigating whether import competition influences the prevalence of zombie firms. The regression model is as follows: Share Zombie Firms (by industry) = b*Import Penetration (by industry) + control variables (firm and industry level) + industry and year fixed effects. I run the analysis only for the manufacturing sector (two-digit SIC codes 20-38), so I have 19 different industries in total. Now the question of clustered standard errors arises. If I cluster the standard errors at the industry level, I get the following result:

    Code:
    areg share_zombiesBH2 L3.penetration L3.ln_at L3.age L3.F_E L3.tnic3hhi L3.dtfp4 i.year, absorb(sic) vce(cluster sic)
    
    Linear regression, absorbing indicators         Number of obs     =     10,760
                                                    F(  18,     18)   =          .
                                                    Prob > F          =          .
                                                    R-squared         =     0.5429
                                                    Adj R-squared     =     0.5411
                                                    Root MSE          =     0.0306
    
                                       (Std. Err. adjusted for 19 clusters in sic)
    ------------------------------------------------------------------------------
                 |               Robust
    share_zomb~2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     penetration |
             L3. |   .3198444   .1800509     1.78   0.093    -.0584284    .6981172
                 |
           ln_at |
             L3. |   .0006393   .0004026     1.59   0.130    -.0002066    .0014852
                 |
             age |
             L3. |  -.0002218   .0001504    -1.48   0.157    -.0005377    .0000941
                 |
             F_E |
             L3. |   .0000837   .0000515     1.63   0.121    -.0000244    .0001918
                 |
        tnic3hhi |
             L3. |  -.0023026   .0011706    -1.97   0.065    -.0047619    .0001567
                 |
           dtfp4 |
             L3. |  -.0112122   .0162087    -0.69   0.498    -.0452654    .0228409
                 |
            year |
           1993  |  -.0084138     .00608    -1.38   0.183    -.0211874    .0043598
           1994  |  -.0195333   .0046546    -4.20   0.001    -.0293122   -.0097545
           1995  |  -.0151116    .004606    -3.28   0.004    -.0247884   -.0054348
           1996  |  -.0346192   .0071837    -4.82   0.000    -.0497115   -.0195269
           1997  |  -.0340963    .009619    -3.54   0.002     -.054305   -.0138877
           1998  |  -.0305902   .0137055    -2.23   0.039    -.0593843   -.0017961
           1999  |   -.017059   .0133948    -1.27   0.219    -.0452005    .0110825
           2000  |  -.0232423   .0109319    -2.13   0.048    -.0462093   -.0002753
           2001  |  -.0088355   .0147003    -0.60   0.555    -.0397197    .0220486
           2002  |   .0187259   .0149259     1.25   0.226    -.0126323     .050084
           2003  |    .019577   .0178908     1.09   0.288    -.0180101    .0571641
           2004  |   .0085283   .0183033     0.47   0.647    -.0299255    .0469821
           2005  |  -.0092883   .0166647    -0.56   0.584    -.0442996     .025723
           2006  |  -.0155196    .020882    -0.74   0.467     -.059391    .0283518
           2007  |  -.0187548   .0259719    -0.72   0.479    -.0733198    .0358103
           2008  |  -.0157858    .023747    -0.66   0.515    -.0656763    .0341048
           2009  |  -.0136269   .0212042    -0.64   0.529    -.0581753    .0309214
           2010  |  -.0205308   .0182965    -1.12   0.277    -.0589703    .0179087
           2011  |  -.0204554   .0232057    -0.88   0.390    -.0692087    .0282978
                 |
           _cons |   .1313362   .0277438     4.73   0.000     .0730486    .1896237
    -------------+----------------------------------------------------------------
             sic |   absorbed                                      (19 categories)

    If, on the other hand, I cluster the standard errors at the firm level, I get totally different results. As can be seen, the t-values are much higher and more coefficients are significant. What would you recommend? Thank you very much for your help.

    Code:
      
    areg share_zombiesBH2 L3.penetration L3.ln_at L3.age L3.F_E L3.tnic3hhi L3.dtfp4 i.year, absorb(sic) vce(cluster gvkey)
    
    Linear regression, absorbing indicators         Number of obs     =     10,760
                                                    F(  25,   1445)   =     198.89
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.5429
                                                    Adj R-squared     =     0.5411
                                                    Root MSE          =     0.0306
    
                                  (Std. Err. adjusted for 1,446 clusters in gvkey)
    ------------------------------------------------------------------------------
                 |               Robust
    share_zomb~2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
     penetration |
             L3. |   .3198444   .0820098     3.90   0.000     .1589734    .4807154
                 |
           ln_at |
             L3. |   .0006393   .0002082     3.07   0.002     .0002308    .0010478
                 |
             age |
             L3. |  -.0002218   .0000835    -2.66   0.008    -.0003855   -.0000581
                 |
             F_E |
             L3. |   .0000837   .0000641     1.31   0.192    -.0000421    .0002094
                 |
        tnic3hhi |
             L3. |  -.0023026   .0013611    -1.69   0.091    -.0049725    .0003673
                 |
           dtfp4 |
             L3. |  -.0112122   .0032899    -3.41   0.001    -.0176656   -.0047588
                 |
            year |
           1993  |  -.0084138   .0010055    -8.37   0.000    -.0103862   -.0064414
           1994  |  -.0195333   .0010636   -18.37   0.000    -.0216197    -.017447
           1995  |  -.0151116   .0010935   -13.82   0.000    -.0172566   -.0129666
           1996  |  -.0346192   .0016817   -20.59   0.000     -.037918   -.0313204
           1997  |  -.0340963   .0023852   -14.30   0.000    -.0387751   -.0294176
           1998  |  -.0305902   .0030292   -10.10   0.000    -.0365323    -.024648
           1999  |   -.017059    .003114    -5.48   0.000    -.0231674   -.0109506
           2000  |  -.0232423   .0033798    -6.88   0.000     -.029872   -.0166125
           2001  |  -.0088355   .0038973    -2.27   0.024    -.0164805   -.0011906
           2002  |   .0187259   .0050066     3.74   0.000     .0089049    .0285469
           2003  |    .019577   .0060633     3.23   0.001     .0076831    .0314708
           2004  |   .0085283   .0059876     1.42   0.155     -.003217    .0202736
           2005  |  -.0092883   .0069266    -1.34   0.180    -.0228755    .0042989
           2006  |  -.0155196   .0076394    -2.03   0.042    -.0305051   -.0005342
           2007  |  -.0187548   .0091819    -2.04   0.041     -.036766   -.0007436
           2008  |  -.0157858   .0096391    -1.64   0.102    -.0346938    .0031223
           2009  |  -.0136269   .0101453    -1.34   0.179    -.0335281    .0062742
           2010  |  -.0205308    .008291    -2.48   0.013    -.0367946    -.004267
           2011  |  -.0204554   .0085807    -2.38   0.017    -.0372875   -.0036234
                 |
           _cons |   .1313362   .0116397    11.28   0.000     .1085038    .1541686
    -------------+----------------------------------------------------------------
             sic |   absorbed                                      (19 categories)
    Last edited by Roman Neuenschwander; 11 Feb 2021, 10:22.

  • #2
    Roman:
    I would cluster the standard errors on -gvkey-.
    That said, I fail to see why you chose -areg- instead of -xtreg, fe-.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Yes indeed, it is a question that crops up a lot here. And it seems that experts disagree.

      You can see the discussion at this thread: https://www.statalist.org/forums/for...tandard-errors


      • #4
        In my opinion, you should cluster at least at the industry level, because the variable "penetration" varies at the industry level.

        best
        Raymond
        Best regards.

        Raymond Zhang
        Stata 17.0,MP
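The argument in #4 can be explored on simulated data. The sketch below is only an illustration under stated assumptions: the panel is artificial, the names x, y, size and firm are hypothetical, and both the outcome and the main regressor are drawn once per industry-year and then copied to every firm, which mimics the structure described in #1. It fits the same -areg- model twice and stores the coefficient, standard error and cluster count for each clustering choice.

```stata
* Toy data: all variable names and parameter values are hypothetical
clear
set seed 12345
set obs 19
generate sic = 19 + _n                    // 19 industries, coded 20..38
expand 10
bysort sic: generate year = 1999 + _n     // 10 years per industry
generate x = rnormal()                    // regressor at industry-year level
generate y = 0.3*x + rnormal()            // outcome at industry-year level
expand 20                                 // 20 firms share each industry-year row
bysort sic year: generate firm = _n
egen gvkey = group(sic firm)              // 380 firm identifiers
generate size = rnormal()                 // firm-level control

* Same model, clustered on industry
areg y x size i.year, absorb(sic) vce(cluster sic)
scalar b_sic  = _b[x]
scalar se_sic = _se[x]
scalar nc_sic = e(N_clust)

* Same model, clustered on firm
areg y x size i.year, absorb(sic) vce(cluster gvkey)
scalar b_firm  = _b[x]
scalar se_firm = _se[x]
scalar nc_firm = e(N_clust)

display "cluster sic:   b = " b_sic  "  se = " se_sic  "  clusters = " nc_sic
display "cluster gvkey: b = " b_firm "  se = " se_firm "  clusters = " nc_firm
```

The point estimate is the same in both runs because vce() only changes the variance estimator. How far the two standard errors diverge in this toy setup is something to read off the printed values rather than assume.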


        • #5
          As Joro reminds us, this is a frequent and interesting question.
          My previous reply was mainly driven by noticing that -gvkey- has a number of clusters many times higher than that of the industry variable (as was to be expected given the dataset structure).
          I would be more confident in the resulting non-default standard errors when they are based on such a large number of clusters.
          Kind regards,
          Carlo
          (Stata 19.0)
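The role of the cluster count is visible in the output from #1: the t statistics use (number of clusters - 1) degrees of freedom, so 18 when clustering on sic and 1,445 when clustering on gvkey. As a small worked check, the sketch below recomputes the two 95% confidence intervals for L3.penetration from the coefficient and standard errors printed in those tables. The scalar names are hypothetical; the numbers are copied from the thread.

```stata
* Figures copied from the two -areg- tables in post #1
scalar pen_b      = .3198444
scalar pen_se_ind = .1800509              // vce(cluster sic),   19 clusters
scalar pen_se_frm = .0820098              // vce(cluster gvkey), 1,446 clusters

* Two-sided 5% critical values with (clusters - 1) degrees of freedom
scalar crit_ind = invttail(19 - 1, 0.025)
scalar crit_frm = invttail(1446 - 1, 0.025)

* Confidence limits: coefficient +/- critical value * standard error
scalar lb_ind = pen_b - crit_ind*pen_se_ind
scalar ub_ind = pen_b + crit_ind*pen_se_ind
scalar lb_frm = pen_b - crit_frm*pen_se_frm
scalar ub_frm = pen_b + crit_frm*pen_se_frm

display "critical value, 18 df:   " %7.4f crit_ind
display "critical value, 1445 df: " %7.4f crit_frm
display "CI, cluster sic:   " %10.7f lb_ind "  " %10.7f ub_ind
display "CI, cluster gvkey: " %10.7f lb_frm "  " %10.7f ub_frm
```

The recomputed limits agree with the printed intervals up to rounding. Most of the gap between the two intervals comes from the standard error itself; the larger critical value with 18 degrees of freedom adds a further, smaller widening.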


          • #6
            Originally posted by Carlo Lazzaro
            Roman:
            I would cluster the standard errors on -gvkey-.
            That said, I fail to see why you chose -areg- instead of -xtreg, fe-.
            Dear Carlo, thank you very much for your reply. You said that you would use xtreg, fe instead of areg. The thing is: I actually don't know how to implement industry fixed effects with xtreg. If I use xtreg, fe I get the following result. I guess the sic dummies are omitted because the "SIC FE" are already contained in the "GVKEY FE"? How can I use xtreg, fe if I only want to control for SIC FE and YEAR FE?

            Code:
             xtreg share_zombiesBH2 L3.penetration L3.ln_at L3.age L3.F_E L3.tnic3hhi L3.dtfp4 i.year i.sic, fe vce(cluster gvkey)
            note: 2011.year omitted because of collinearity
            note: 21.sic omitted because of collinearity
            note: 22.sic omitted because of collinearity
            note: 23.sic omitted because of collinearity
            note: 24.sic omitted because of collinearity
            note: 25.sic omitted because of collinearity
            note: 26.sic omitted because of collinearity
            note: 27.sic omitted because of collinearity
            note: 28.sic omitted because of collinearity
            note: 29.sic omitted because of collinearity
            note: 30.sic omitted because of collinearity
            note: 31.sic omitted because of collinearity
            note: 32.sic omitted because of collinearity
            note: 33.sic omitted because of collinearity
            note: 34.sic omitted because of collinearity
            note: 35.sic omitted because of collinearity
            note: 36.sic omitted because of collinearity
            note: 37.sic omitted because of collinearity
            note: 38.sic omitted because of collinearity
            
            Fixed-effects (within) regression               Number of obs     =     10,760
            Group variable: gvkey                           Number of groups  =      1,446
            
            R-sq:                                           Obs per group:
                 within  = 0.2926                                         min =          1
                 between = 0.2426                                         avg =        7.4
                 overall = 0.2221                                         max =         20
            
                                                            F(24,1445)        =     160.59
            corr(u_i, Xb)  = -0.4160                        Prob > F          =     0.0000
            
                                          (Std. Err. adjusted for 1,446 clusters in gvkey)
            ------------------------------------------------------------------------------
                         |               Robust
            share_zomb~2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
             penetration |
                     L3. |   .3091428   .0962276     3.21   0.001     .1203822    .4979035
                         |
                   ln_at |
                     L3. |   .0023706   .0013811     1.72   0.086    -.0003385    .0050797
                         |
                     age |
                     L3. |  -.0012302   .0005268    -2.34   0.020    -.0022635   -.0001968
                         |
                     F_E |
                     L3. |  -.0000622   .0000685    -0.91   0.364    -.0001965    .0000721
                         |
                tnic3hhi |
                     L3. |  -.0015206   .0026089    -0.58   0.560    -.0066383    .0035971
                         |
                   dtfp4 |
                     L3. |  -.0055812   .0032852    -1.70   0.090    -.0120255     .000863
                         |
                    year |
                   1993  |  -.0070924    .001049    -6.76   0.000    -.0091502   -.0050347
                   1994  |  -.0174078    .001324   -13.15   0.000     -.020005   -.0148105
                   1995  |  -.0118363   .0014896    -7.95   0.000    -.0147583   -.0089143
                   1996  |  -.0302189   .0017185   -17.58   0.000      -.03359   -.0268478
                   1997  |  -.0287877   .0019339   -14.89   0.000    -.0325812   -.0249942
                   1998  |  -.0244304   .0020998   -11.63   0.000    -.0285493   -.0203115
                   1999  |  -.0106844   .0022496    -4.75   0.000    -.0150971   -.0062716
                   2000  |  -.0162359   .0021475    -7.56   0.000    -.0204484   -.0120234
                   2001  |  -.0003507   .0025951    -0.14   0.893    -.0054412    .0047397
                   2002  |   .0278744    .002579    10.81   0.000     .0228153    .0329334
                   2003  |    .028916   .0028754    10.06   0.000     .0232756    .0345563
                   2004  |   .0195569   .0027003     7.24   0.000       .01426    .0248538
                   2005  |   .0027552   .0025479     1.08   0.280    -.0022429    .0077532
                   2006  |  -.0028305   .0027571    -1.03   0.305    -.0082388    .0025778
                   2007  |  -.0049363   .0038147    -1.29   0.196    -.0124191    .0025466
                   2008  |  -.0000441   .0036302    -0.01   0.990    -.0071651     .007077
                   2009  |   .0038639   .0035811     1.08   0.281    -.0031608    .0108887
                   2010  |  -.0015209   .0015341    -0.99   0.322    -.0045301    .0014883
                   2011  |          0  (omitted)
                         |
                     sic |
                     21  |          0  (omitted)
                     22  |          0  (omitted)
                     23  |          0  (omitted)
                     24  |          0  (omitted)
                     25  |          0  (omitted)
                     26  |          0  (omitted)
                     27  |          0  (omitted)
                     28  |          0  (omitted)
                     29  |          0  (omitted)
                     30  |          0  (omitted)
                     31  |          0  (omitted)
                     32  |          0  (omitted)
                     33  |          0  (omitted)
                     34  |          0  (omitted)
                     35  |          0  (omitted)
                     36  |          0  (omitted)
                     37  |          0  (omitted)
                     38  |          0  (omitted)
                         |
                   _cons |   .1284986   .0109015    11.79   0.000     .1071142     .149883
            -------------+----------------------------------------------------------------
                 sigma_u |  .03394698
                 sigma_e |  .02912656
                     rho |  .57598162   (fraction of variance due to u_i)
            ------------------------------------------------------------------------------

            I also tried reg and reghdfe, which seem more appropriate in my case. Which regression model would you recommend? I am sorry for my basic questions, but I am still struggling to choose the most appropriate model. I also carefully read the slides by Oscar Torres-Reyna (https://www.princeton.edu/~otorres/Panel101.pdf), which lead me to think that it does not really matter which model I use?


            Code:
              
            reg share_zombiesBH2 L3.penetration L3.ln_at L3.age L3.F_E L3.tnic3hhi L3.dtfp4 i.sic i.year, vce(cluster gvkey)
            
            Linear regression                               Number of obs     =     10,760
                                                            F(43, 1445)       =     481.80
                                                            Prob > F          =     0.0000
                                                            R-squared         =     0.5429
                                                            Root MSE          =     .03061
            
                                          (Std. Err. adjusted for 1,446 clusters in gvkey)
            ------------------------------------------------------------------------------
                         |               Robust
            share_zomb~2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
             penetration |
                     L3. |   .3198444   .0820098     3.90   0.000     .1589734    .4807154
                         |
                   ln_at |
                     L3. |   .0006393   .0002082     3.07   0.002     .0002308    .0010478
                         |
                     age |
                     L3. |  -.0002218   .0000835    -2.66   0.008    -.0003855   -.0000581
                         |
                     F_E |
                     L3. |   .0000837   .0000641     1.31   0.192    -.0000421    .0002094
                         |
                tnic3hhi |
                     L3. |  -.0023026   .0013611    -1.69   0.091    -.0049725    .0003673
                         |
                   dtfp4 |
                     L3. |  -.0112122   .0032899    -3.41   0.001    -.0176656   -.0047588
                         |
                     sic |
                     21  |  -.1256625   .0062298   -20.17   0.000     -.137883   -.1134421
                     22  |   -.041868   .0087362    -4.79   0.000     -.059005   -.0247311
                     23  |  -.0664815   .0297876    -2.23   0.026     -.124913     -.00805
                     24  |   .0933224   .0122431     7.62   0.000     .0693062    .1173386
                     25  |  -.0637759   .0093951    -6.79   0.000    -.0822053   -.0453464
                     26  |   .0781869   .0039424    19.83   0.000     .0704535    .0859203
                     27  |  -.0279051   .0041315    -6.75   0.000    -.0360093   -.0198008
                     28  |   -.000757    .006439    -0.12   0.906    -.0133878    .0118738
                     29  |    .063047   .0079639     7.92   0.000     .0474249     .078669
                     30  |   .0012142   .0054826     0.22   0.825    -.0095406     .011969
                     31  |  -.1109139   .0597486    -1.86   0.064    -.2281171    .0062893
                     32  |   .0549293   .0171755     3.20   0.001     .0212377    .0886208
                     33  |    .023575   .0128381     1.84   0.067    -.0016083    .0487583
                     34  |   .0025715   .0038924     0.66   0.509    -.0050638    .0102068
                     35  |  -.0234893   .0156814    -1.50   0.134      -.05425    .0072714
                     36  |  -.0256094   .0204221    -1.25   0.210    -.0656696    .0144508
                     37  |  -.0174381   .0150086    -1.16   0.245    -.0468791    .0120028
                     38  |   .0175823   .0100688     1.75   0.081    -.0021688    .0373333
                         |
                    year |
                   1993  |  -.0084138   .0010055    -8.37   0.000    -.0103862   -.0064414
                   1994  |  -.0195333   .0010636   -18.37   0.000    -.0216197    -.017447
                   1995  |  -.0151116   .0010935   -13.82   0.000    -.0172566   -.0129666
                   1996  |  -.0346192   .0016817   -20.59   0.000     -.037918   -.0313204
                   1997  |  -.0340963   .0023852   -14.30   0.000    -.0387751   -.0294176
                   1998  |  -.0305902   .0030292   -10.10   0.000    -.0365323    -.024648
                   1999  |   -.017059    .003114    -5.48   0.000    -.0231674   -.0109506
                   2000  |  -.0232423   .0033798    -6.88   0.000     -.029872   -.0166125
                   2001  |  -.0088355   .0038973    -2.27   0.024    -.0164805   -.0011906
                   2002  |   .0187259   .0050066     3.74   0.000     .0089049    .0285469
                   2003  |    .019577   .0060633     3.23   0.001     .0076831    .0314708
                   2004  |   .0085283   .0059876     1.42   0.155     -.003217    .0202736
                   2005  |  -.0092883   .0069266    -1.34   0.180    -.0228755    .0042989
                   2006  |  -.0155196   .0076394    -2.03   0.042    -.0305051   -.0005342
                   2007  |  -.0187548   .0091819    -2.04   0.041     -.036766   -.0007436
                   2008  |  -.0157858   .0096391    -1.64   0.102    -.0346938    .0031223
                   2009  |  -.0136269   .0101453    -1.34   0.179    -.0335281    .0062742
                   2010  |  -.0205308    .008291    -2.48   0.013    -.0367946    -.004267
                   2011  |  -.0204554   .0085807    -2.38   0.017    -.0372875   -.0036234
                         |
                   _cons |   .1364423   .0022234    61.37   0.000     .1320809    .1408038
            ------------------------------------------------------------------------------
            Code:
              
            reghdfe share_zombiesBH2 L3.penetration L3.ln_at L3.age L3.F_E L3.tnic3hhi L3.dtfp4, absorb(year sic) vce(cluster gvkey)
            (MWFE estimator converged in 5 iterations)
            
            HDFE Linear regression                            Number of obs   =     10,760
            Absorbing 2 HDFE groups                           F(   6,   1445) =      11.92
            Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                              R-squared       =     0.5429
                                                              Adj R-squared   =     0.5411
                                                              Within R-sq.    =     0.0466
            Number of clusters (gvkey)   =      1,446         Root MSE        =     0.0306
            
                                          (Std. Err. adjusted for 1,446 clusters in gvkey)
            ------------------------------------------------------------------------------
                         |               Robust
            share_zomb~2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
             penetration |
                     L3. |   .3198444   .0820098     3.90   0.000     .1589734    .4807154
                         |
                   ln_at |
                     L3. |   .0006393   .0002082     3.07   0.002     .0002308    .0010478
                         |
                     age |
                     L3. |  -.0002218   .0000835    -2.66   0.008    -.0003855   -.0000581
                         |
                     F_E |
                     L3. |   .0000837   .0000641     1.31   0.192    -.0000421    .0002094
                         |
                tnic3hhi |
                     L3. |  -.0023026   .0013611    -1.69   0.091    -.0049725    .0003673
                         |
                   dtfp4 |
                     L3. |  -.0112122   .0032899    -3.41   0.001    -.0176656   -.0047588
                         |
                   _cons |   .1182142   .0155904     7.58   0.000     .0876319    .1487966
            ------------------------------------------------------------------------------
            
            Absorbed degrees of freedom:
            -----------------------------------------------------+
             Absorbed FE | Categories  - Redundant  = Num. Coefs |
            -------------+---------------------------------------|
                    year |        20           0          20     |
                     sic |        19           1          18     |
            -----------------------------------------------------+
            Last edited by Roman Neuenschwander; 12 Feb 2021, 02:47.
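The collinearity notes in the -xtreg, fe- output above are what one would expect if sic never changes within a firm: the firm fixed effects then already span the industry dummies. A minimal sketch of how to check that nesting follows, using a tiny artificial panel with hypothetical values. On the real data the same two lines at the end would be run on the existing gvkey and sic variables.

```stata
* Toy panel with hypothetical values: 6 firms, 4 years each
clear
set obs 6
generate gvkey = _n
generate sic = 20 + mod(_n, 3)            // each firm gets one industry code
expand 4
bysort gvkey: generate year = 2000 + _n

* Flag firms whose industry code differs between any two of their rows
bysort gvkey (sic): generate byte sic_varies = sic[1] != sic[_N]
count if sic_varies
display "observations in firms with a changing sic: " r(N)
```

A count of zero means every firm belongs to exactly one industry, in which case i.sic cannot be estimated alongside firm fixed effects. That is consistent with -reg- with i.sic, -areg, absorb(sic)- and -reghdfe, absorb(year sic)- all reporting the same coefficients above, while -xtreg, fe- with gvkey as the group variable fits a different model.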


            • #7
              Roman:
              the community-contributed module -reghdfe- seems the way to go here.
              Kind regards,
              Carlo
              (Stata 19.0)


              • #8
                Thank you for your support, Carlo.


                • #9
                  As a general guide to clustering standard errors, you may want to have a look at the Cameron and Miller (2015) paper.
