  • A conceptual question about when should I add fixed effect and cluster the fixed effect?

    Suppose I add a time-fixed effect to a panel data regression that I want to estimate using OLS. My question is conceptually when should I also cluster by time (in addition to adding fixed effects)?

  • #2
    John:
    let's assume that you are dealing with a short panel (N>T) and want to estimate a fixed-effects (-fe-) model via OLS (BTW: this approach is outperformed by -xtreg,fe-).
    In that case you should cluster on -panelid- (in addition to adding the fixed effects), as the epsilon term might be correlated across the observations belonging to the same panel.
    Clustering on -timevar- only is not recommended (as you're mainly interested in -panelid-), but you may want to cluster your standard errors on both the N and T dimensions (say, when a given "shock" is expected to hit panels differently across time). You can get this double clustering via the community-contributed module -reghdfe-:
    Code:
    use https://www.stata-press.com/data/r17/nlswork.dta
    xtreg ln_wage age i.year, fe vce(cluster idcode)
    . xtreg ln_wage age i.year, fe vce(cluster idcode)
    
    Fixed-effects (within) regression               Number of obs     =     28,510
    Group variable: idcode                          Number of groups  =      4,710
    
    R-squared:                                      Obs per group:
         Within  = 0.1060                                         min =          1
         Between = 0.0914                                         avg =        6.1
         Overall = 0.0805                                         max =         15
    
                                                    F(15,4709)        =      69.49
    corr(u_i, Xb) = 0.0467                          Prob > F          =     0.0000
    
                                 (Std. err. adjusted for 4,710 clusters in idcode)
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .0125992   .0123091     1.02   0.306    -.0115323    .0367308
                 |
            year |
             69  |   .0748621   .0156425     4.79   0.000     .0441955    .1055287
             70  |   .0478697   .0265729     1.80   0.072    -.0042256    .0999649
             71  |   .0865577   .0385328     2.25   0.025     .0110155       .1621
             72  |   .0856757   .0505004     1.70   0.090    -.0133288    .1846802
             73  |   .0880069   .0626993     1.40   0.160    -.0349132    .2109269
             75  |   .0778607   .0865126     0.90   0.368    -.0917446     .247466
             77  |    .108365   .1111117     0.98   0.329    -.1094659    .3261959
             78  |   .1309518   .1237306     1.06   0.290    -.1116181    .3735217
             80  |   .1142649   .1480678     0.77   0.440    -.1760172    .4045471
             82  |   .1090451   .1724619     0.63   0.527    -.2290608    .4471511
             83  |   .1211272   .1846402     0.66   0.512    -.2408539    .4831083
             85  |   .1465637   .2092454     0.70   0.484    -.2636552    .5567825
             87  |   .1382642   .2341219     0.59   0.555    -.3207242    .5972527
             88  |   .1799741   .2500607     0.72   0.472    -.3102618      .67021
                 |
           _cons |   1.203731    .235213     5.12   0.000     .7426037    1.664859
    -------------+----------------------------------------------------------------
         sigma_u |   .4058746
         sigma_e |  .30300411
             rho |  .64212421   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    
    reghdfe ln_w age i.year, absorb(idcode) vce(cluster idcode year)
    . reghdfe ln_w age i.year, absorb(idcode) vce(cluster idcode year)
    (dropped 551 singleton observations)
    (MWFE estimator converged in 1 iterations)
    Warning: VCV matrix was non-positive semi-definite; adjustment from Cameron, Gelbach & Miller applied.
    warning: missing F statistic; dropped variables due to collinearity or too few clusters
    
    HDFE Linear regression                            Number of obs   =     27,959
    Absorbing 1 HDFE group                            F(  15,     14) =          .
    Statistics robust to heteroskedasticity           Prob > F        =          .
                                                      R-squared       =     0.6553
                                                      Adj R-squared   =     0.5949
    Number of clusters (idcode)  =      4,159         Within R-sq.    =     0.1060
    Number of clusters (year)    =         15         Root MSE        =     0.3030
    
                               (Std. err. adjusted for 15 clusters in idcode year)
    ------------------------------------------------------------------------------
                 |               Robust
         ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             age |   .0125992   .0109367     1.15   0.269    -.0108576    .0360561
                 |
            year |
             69  |   .0748621   .0107168     6.99   0.000     .0518768    .0978474
             70  |   .0478697   .0232197     2.06   0.058    -.0019315    .0976709
             71  |   .0865577   .0354149     2.44   0.028     .0106003    .1625152
             72  |   .0856757   .0475462     1.80   0.093    -.0163007    .1876521
             73  |   .0880069   .0600766     1.46   0.165    -.0408447    .2168584
             75  |   .0778607   .0818643     0.95   0.358    -.0977207    .2534421
             77  |    .108365   .1034378     1.05   0.313    -.1134871    .3302171
             78  |   .1309518   .1155376     1.13   0.276    -.1168516    .3787553
             80  |   .1142649   .1367118     0.84   0.417    -.1789528    .4074827
             82  |   .1090451   .1562581     0.70   0.497    -.2260953    .4441855
             83  |   .1211272   .1662107     0.73   0.478    -.2353592    .4776136
             85  |   .1465637   .1877183     0.78   0.448    -.2560521    .5491794
             87  |   .1382642   .2093267     0.66   0.520    -.3106968    .5872253
             88  |   .1799741   .2237412     0.80   0.435     -.299903    .6598513
                 |
           _cons |   1.205651   .2071379     5.82   0.000     .7613846    1.649918
    ------------------------------------------------------------------------------
    
    Absorbed degrees of freedom:
    -----------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
    -------------+---------------------------------------|
          idcode |      4159        4159           0    *|
    -----------------------------------------------------+
    * = FE nested within cluster; treated as redundant for DoF computation
    
    .
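    For reference, the double clustering requested with vce(cluster idcode year) is commonly computed with the Cameron, Gelbach & Miller formula mentioned in the warning above. A sketch in standard notation (the symbols are illustrative and not taken from the -reghdfe- output; small-sample corrections are omitted):

    ```latex
    \widehat{V}_{\text{two-way}}
      = \widehat{V}_{\text{idcode}}
      + \widehat{V}_{\text{year}}
      - \widehat{V}_{\text{idcode} \cap \text{year}}
    ```

    Each term is a one-way cluster-robust variance matrix: clustered on the panel, on the year, and on their intersection. Because the last term is subtracted, the sum is not guaranteed to be positive semi-definite in finite samples, which is what the adjustment in the warning addresses. With only 15 year clusters, the resulting standard errors should be read with caution.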
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      1) I think there was a misunderstanding about time and firm fixed effects. My context is asset pricing. The panel unit is a country: the LHS is the return of the stock index in that country in a given quarter, and the RHS consists of some macro factors such as GDP growth. So my question was whether it is necessary to cluster by time when we have already added a time fixed effect (here N = 40 < T = 300).



      • #4
        John:
        for T>N panel datasets with -panelid- fixed effects, see -xtregar,fe-; note, however, that it does not support clustered standard errors.
        Last edited by Carlo Lazzaro; 31 Oct 2022, 11:27.
        Kind regards,
        Carlo
        (Stata 19.0)
