  • Clustering Standard Error

    I am studying the relationship between FDI inflows and military expenditure, using panel data on 62 developing countries from 1990 to 2018. The countries are divided into five geographical regions. Could I cluster my standard errors by geographical region instead of by country? The Wikipedia article mentions that 30-50 clusters are preferred, but I could not find the reason behind this.

  • #2
    Prashant, you may cluster at the region level, but it's not necessary and may be too conservative. Clustering at country level would be standard for your case.
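
    A rough illustration of why so few clusters is a concern, assuming Stata bases cluster-robust tests on a t distribution with G - 1 degrees of freedom for G clusters. The nlswork data stand in for your panel here, with idcode playing the role of country; your own variable names will differ.
    Code:
    * two-sided 5% critical values: 5 region clusters vs. 62 country clusters
    display invttail(4, 0.025)     // df = 5 - 1
    display invttail(61, 0.025)    // df = 62 - 1

    * clustering on the panel identifier, the analogue of clustering by country
    use "https://www.stata-press.com/data/r16/nlswork.dta", clear
    xtreg ln_wage age, fe vce(cluster idcode)
    With 4 degrees of freedom the critical value is about 2.78, against about 2.00 with 61, so the same standard error gives a noticeably wider confidence interval when there are only five clusters.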


    • #3
      In addition to Fey's informative response, here is a very useful article you may want to read:

      https://www.nber.org/papers/w24003


      • #4
        Prashant:
        in addition to the previous helpful guidance, if you plan to use a fixed-effects (-fe-) estimator, you may want to consider the community-contributed module -reghdfe- (just type -search reghdfe- from within Stata to find and install it), which allows multi-way clustering (unlike -xtreg, fe-), as shown in the following toy example:
        Code:
        . use "https://www.stata-press.com/data/r16/nlswork.dta"
        (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
        
        . xtreg ln_wage age, fe vce(cluster race)
        
        Fixed-effects (within) regression               Number of obs      =     28510
        Group variable: idcode                          Number of groups   =      4710
        
        R-sq:  within  = 0.1026                         Obs per group: min =         1
               between = 0.0877                                        avg =       6.1
               overall = 0.0774                                        max =        15
        
                                                        F(1,2)             =    761.45
        corr(u_i, Xb)  = 0.0314                         Prob > F           =    0.0013
        
                                           (Std. Err. adjusted for 3 clusters in race)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   .0181349   .0006572    27.59   0.001     .0153072    .0209626
               _cons |   1.148214   .0190883    60.15   0.000     1.066083    1.230344
        -------------+----------------------------------------------------------------
             sigma_u |  .40635023
             sigma_e |  .30349389
                 rho |  .64192015   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        
        . reghdfe ln_wage age, abs(idcode) vce(cluster race birth_yr)
        (dropped 551 singleton observations)
        (converged in 1 iterations)
        
        HDFE Linear regression                            Number of obs   =     27,959
        Absorbing 1 HDFE group                            F(   1,      2) =     325.59
        Statistics robust to heteroskedasticity           Prob > F        =     0.0031
                                                          R-squared       =     0.6540
                                                          Adj R-squared   =     0.5936
        Number of clusters (race)    =          3         Within R-sq.    =     0.1026
        Number of clusters (birth_yr) =         14        Root MSE        =     0.3035
        
                                  (Std. Err. adjusted for 3 clusters in race birth_yr)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   .0181349    .001005    18.04   0.003     .0138106    .0224592
        ------------------------------------------------------------------------------
        
        Absorbed degrees of freedom:
        ---------------------------------------------------------------+
         Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
        -------------+-------------------------------------------------|
              idcode |            0            4159           4159 *   |
        ---------------------------------------------------------------+
        * = fixed effect nested within cluster; treated as redundant for DoF computation
        Kind regards,
        Carlo
        (Stata 19.0)
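
        A possible adaptation to a country-year panel, sketched on the same nlswork data: two-way clustering on the panel identifier and on year, with idcode standing in for country. This assumes -reghdfe- and its dependency -ftools- can be installed from SSC; the choice of clustering dimensions is for illustration only, not a recommendation for your model.
        Code:
        * one-time installation of the community-contributed commands
        ssc install ftools
        ssc install reghdfe

        use "https://www.stata-press.com/data/r16/nlswork.dta", clear
        * individual fixed effects, two-way clustering by individual and by year
        reghdfe ln_wage age, absorb(idcode) vce(cluster idcode year)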


        • #5
          Thank you so much for all the helpful responses!
