  • Testing Fixed Effects Model with Clustered Standard Errors

    Dear Forum,

    I am currently working with a fixed effects linear probability model with T = 4 and N = 89. The model uses clustered standard errors (approximately 30 clusters), so when I run -xtreg <vars>, fe vce(cluster <cluster var>)-, Stata does not report the F statistic that tests for poolability.

    I have searched here for a while for a possible solution but have not found a suitable fix. I have already used an F test to show the presence of fixed effects, but I am not sure whether that is appropriate: my impression is that, because I am using clustered standard errors, a regular F test cannot be performed here. Is there a solution to this?


    Many thanks

  • #2
    Ethan:
    Welcome to this forum.
    The F-test for the presence (or absence) of a panel-wise effect is not reported when non-default standard errors are used.
    That said, and although I'm not aware of any fix:
    1) you can check the correlation between the fixed effect and the vector of regressors to get an idea of whether a panel-wise effect is present;
    2) you can look at the magnitude of the -rho- value for the very same purpose.
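
    As a rough sketch of the two checks above: after -xtreg, fe-, Stata stores the reported corr(u_i, Xb) as e(corr) and the fraction of variance due to u_i as e(rho). The dataset and specification below are illustrative assumptions only (the -nlswork- example data stands in for the game-show data), and no cut-off value is implied.

    ```stata
    * Illustrative only: nlswork stands in for the real data
    webuse nlswork, clear
    xtset idcode year

    * Fixed-effects model with standard errors clustered on the panel id
    xtreg ln_wage c.age##c.age, fe vce(cluster idcode)

    * Stored results that correspond to checks 1) and 2)
    display "corr(u_i, Xb) = " %6.4f e(corr)
    display "rho           = " %6.4f e(rho)
    ```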

    That said, and a bit off-topic here, you clustered your standard errors on a variable that differs from the panel identifier (about 30 clusters versus N = 89). This is a bit unusual: is it what you want?
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Hi Carlo,

      Thank you for the welcome and speedy response to my query.

      If I understand the intuition, I am looking at the variation in the regressors that is caused by the fixed effects. The higher this, the more I should consider switching to a non-panel model?

      Is there a critical value for the -rho- that you consider to be the point at which we ring alarm bells?

      With regard to your clustering question: I am aware it is unusual. The data are from a game show, so the clusters are currently defined by season of the show. I think clustering at the season level could make sense, since the seasons are not identical repetitions of one another and interactions between subjects are contained within each season. Although, now that you have brought it up, perhaps I should cluster at the individual level, since I recall from my Econometrics that I should be clustering at the lowest group. Thank you for bringing this up, since I might otherwise have been caught out at my thesis defence!
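
      A minimal sketch of the two clustering choices, on simulated data with the same shape as described (89 individuals, 4 periods, 30 seasons). All variable names (id, season, t, x, y) are hypothetical and the simulated outcome is pure noise, so only the mechanics matter here, not the estimates. The sketch assumes each individual appears in exactly one season, so that panels are nested within clusters.

      ```stata
      * Simulated stand-in data; every name below is hypothetical
      clear
      set seed 12345
      set obs 89
      generate id = _n
      * Three individuals per season gives 30 seasons (the last has two)
      generate season = ceil(id/3)
      * Four observations per individual (T = 4)
      expand 4
      bysort id: generate t = _n
      generate x = rnormal()
      * Binary outcome drawn independently of x
      generate byte y = runiform() < 0.5
      xtset id t

      * Cluster at the individual level
      xtreg y x, fe vce(cluster id)

      * Cluster at the season level (individuals are nested in seasons)
      xtreg y x, fe vce(cluster season)
      ```

      Comparing the two sets of standard errors on the real data would show how sensitive the inference is to the clustering level.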

      Kind regards,
      Ethan


      • #4
        Ethan:
        1) you're investigating whether there's evidence of a within panel-wise effect in your dataset. Hence, taking a look at the -within R-sq- value provided by Stata is a good habit (unfortunately, there's no hard and fast rule to define a cut-off/threshold value).
        2) in addition to wiping out time-invariant predictors and getting rid of unobserved time-invariant heterogeneity, the -fe- estimator allows for a weak form of endogeneity (i.e., the -u- error component may be correlated with the vector of predictors). Hence, taking a look at the corr(u_i, Xb) value reported by Stata is a good habit (unfortunately again, there's no hard and fast rule to define a cut-off/threshold value).
        3) if you're interested in two-way clustering with -fe-, you may want to consider the community-contributed module -reghdfe-, an application of which is reported in the following toy-example:
        Code:
        . use "https://www.stata-press.com/data/r16/nlswork.dta"
        . reghdfe ln_wage c.age##c.age, abs(idcode) vce(cluster idcode race)
        (dropped 551 singleton observations)
        (converged in 1 iterations)
        
        HDFE Linear regression                            Number of obs   =     27,959
        Absorbing 1 HDFE group                            F(   2,      2) =    4669.23
        Statistics robust to heteroskedasticity           Prob > F        =     0.0002
                                                          R-squared       =     0.6564
                                                          Adj R-squared   =     0.5963
        Number of clusters (idcode)  =      4,159         Within R-sq.    =     0.1087
        Number of clusters (race)    =          3         Root MSE        =     0.3025
        
                                    (Std. Err. adjusted for 3 clusters in idcode race)
        ------------------------------------------------------------------------------
                     |               Robust
             ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 age |   .0539076   .0068973     7.82   0.016     .0242309    .0835844
                     |
         c.age#c.age |  -.0005973   .0001058    -5.65   0.030    -.0010525   -.0001421
        ------------------------------------------------------------------------------
        
        Absorbed degrees of freedom:
        ---------------------------------------------------------------+
         Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
        -------------+-------------------------------------------------|
              idcode |            0            4159           4159 *   |
        ---------------------------------------------------------------+
        * = fixed effect nested within cluster; treated as redundant for DoF computation
        
        .
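
        For the within R-squared mentioned in point 1, a small sketch: -xtreg, fe- stores it as e(r2_w), so it can be displayed or saved after estimation. The specification mirrors the toy example above and is an illustrative assumption, not the game-show model.

        ```stata
        * Illustrative only: same example data as the reghdfe run
        webuse nlswork, clear
        xtset idcode year
        xtreg ln_wage c.age##c.age, fe vce(cluster idcode)

        * Within R-squared as stored by xtreg, fe
        display "within R-sq = " %6.4f e(r2_w)
        ```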
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Carlo,
          I really appreciate your responses, and I am sorry it took so long for me to express my gratitude. This has helped me immensely. I wish you a pleasant day.
          Best
          Ethan
