Clustering Standard Error

Prashant Bhandari

Join Date: Apr 2021

Posts: 11
#1

Clustering Standard Error

02 Nov 2021, 21:06

I am studying the relationship between FDI inflows and military expenditure. I have panel data on 62 developing countries from 1990 to 2018. These countries are divided into five geographical regions. I was wondering whether I could cluster my standard errors by geographical region instead of by country. The Wikipedia article mentions that 30-50 clusters are preferred, but I could not find a particular reason for this.
Fei Wang

Join Date: Oct 2021

Posts: 726
#2

02 Nov 2021, 23:57

Prashant, you may cluster at the region level, but it's not necessary and may be too conservative. Clustering at the country level would be standard in your case.
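
For a panel like the one described, country-level clustering might look like the minimal sketch below. The variable names fdi, milex, country, and year are assumptions (they do not come from the original data), and country is assumed to be a numeric identifier.

```stata
* Hypothetical variable names: fdi (outcome), milex (regressor),
* country (numeric panel identifier), year (time variable).
* If country is stored as a string, create a numeric id first, e.g.:
*   encode country, generate(country_id)
xtset country year

* Country fixed effects plus year dummies, with standard errors
* clustered at the country level (62 clusters in the panel described)
xtreg fdi milex i.year, fe vce(cluster country)
```

Replacing vce(cluster country) with a region identifier would leave only five clusters, far fewer than the 30-50 mentioned in the question.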
Maxence Morlet

Join Date: Mar 2021

Posts: 653
#3

03 Nov 2021, 01:01

In addition to Fei's informative response, here is a very useful article you may want to read:

https://www.nber.org/papers/w24003

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17711

03 Nov 2021, 09:32

Prashant:
in addition to the previous helpful guidance, if you plan to use fixed effects (-fe-), you may want to consider the community-contributed module -reghdfe- (just type -search reghdfe- from within Stata to find and install it), which, unlike -xtreg, fe-, allows multi-way clustering, as shown in the following toy example:

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtreg ln_wage age, fe vce(cluster race)

Fixed-effects (within) regression               Number of obs      =     28510
Group variable: idcode                          Number of groups   =      4710

R-sq:  within  = 0.1026                         Obs per group: min =         1
       between = 0.0877                                        avg =       6.1
       overall = 0.0774                                        max =        15

                                                F(1,2)             =    761.45
corr(u_i, Xb)  = 0.0314                         Prob > F           =    0.0013

                                   (Std. Err. adjusted for 3 clusters in race)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0006572    27.59   0.001     .0153072    .0209626
       _cons |   1.148214   .0190883    60.15   0.000     1.066083    1.230344
-------------+----------------------------------------------------------------
     sigma_u |  .40635023
     sigma_e |  .30349389
         rho |  .64192015   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. reghdfe ln_wage age, abs(idcode) vce(cluster race birth_yr)
(dropped 551 singleton observations)
(converged in 1 iterations)

HDFE Linear regression                            Number of obs   =     27,959
Absorbing 1 HDFE group                            F(   1,      2) =     325.59
Statistics robust to heteroskedasticity           Prob > F        =     0.0031
                                                  R-squared       =     0.6540
                                                  Adj R-squared   =     0.5936
Number of clusters (race)    =          3         Within R-sq.    =     0.1026
Number of clusters (birth_yr) =         14        Root MSE        =     0.3035

                          (Std. Err. adjusted for 3 clusters in race birth_yr)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349    .001005    18.04   0.003     .0138106    .0224592
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
-------------+-------------------------------------------------|
      idcode |            0            4159           4159 *   |
---------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation
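
Applied to the panel in the original question, the same approach might look like the sketch below. The variable names fdi, milex, country, and year are assumptions rather than names from the actual dataset, and country and year are assumed to be numeric.

```stata
* One-time installation from SSC, if not already present
* (reghdfe depends on ftools):
*   ssc install ftools
*   ssc install reghdfe

* Hypothetical variable names: fdi, milex, country, year.
* Absorb country and year fixed effects; cluster two-way by country and year.
reghdfe fdi milex, absorb(country year) vce(cluster country year)
```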

Kind regards,
Carlo
(Stata 19.0)


Prashant Bhandari

Join Date: Apr 2021

Posts: 11
#5

04 Nov 2021, 10:37

Thank you so much for all the helpful responses!