Testing Fixed Effects Model with Clustered Standard Errors

Ethan OLeary

Join Date: Oct 2021

Posts: 4
#1


13 Oct 2021, 10:08

Dear Forum,

I am currently working with a fixed effects linear probability model with T = 4 and N = 89. The model uses clustered standard errors (approximately 30 clusters), so when I run -xtreg <vars>, fe vce(cluster <cluster var>)-, Stata does not report the F statistic for testing poolability.

I have searched here for a while for a possible solution but have not found a suitable fix. I have already used an F test to show the presence of fixed effects, but I am not sure whether this is appropriate: my impression is that a regular F test cannot be performed when clustered standard errors are used. Is there a solution to this?

Many thanks
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

13 Oct 2021, 10:26

Ethan:
welcome to this forum.
Indeed, the F-test for evidence (or not) of a panel-wise effect is not reported with non-default standard errors.
That said, and given that I'm not aware of any fix:
1) you can check the correlation between the fixed effect and the vector of regressors to get an idea of whether a panel-wise effect is present;
2) you can look at the magnitude of the -rho- value for the very same purpose.

As an aside, and a bit off-topic here, you clustered your standard errors on a variable that differs from your panel identifier (N). This is a bit unusual: is it what you want?

Kind regards,
Carlo
(Stata 19.0)
Ethan OLeary

Join Date: Oct 2021

Posts: 4
#3

13 Oct 2021, 11:22

Hi Carlo,

Thank you for the welcome and speedy response to my query.

If I understand the intuition correctly, I am looking at the variation in the regressors that is due to the fixed effects. The higher this is, the more I should consider switching to a non-panel model?

Is there a critical value of -rho- at which you would consider that alarm bells should ring?

Regarding your clustering question: I am aware it is unusual. The data are from a game show, so the clusters are currently the seasons of the show. I think clustering at the season level could make sense, since the seasons are not identical repetitions of one another and interactions between subjects are contained within each season. Although, now that you have brought it up, perhaps I should cluster at the individual level, since I recall from my Econometrics course that I should be clustering at the lowest group. Thank you for bringing this up, since I might have been caught out at my thesis defence!
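
To see how much the choice of cluster level matters in practice, the two sets of standard errors can be placed side by side. The sketch below is only an illustration under assumptions: it uses the -nlswork- example dataset rather than the game show data, with -idcode- standing in for the individual and -race- standing in for a coarser grouping such as season (with only 3 groups, -race- is far too coarse for real inference and serves only to show the mechanics).

```stata
* Illustrative sketch: nlswork stands in for the game show data
webuse nlswork, clear
xtset idcode year

* Cluster at the individual (panel) level
xtreg ln_wage c.age##c.age, fe vce(cluster idcode)
estimates store cl_id

* Cluster at a coarser level within which individuals are nested
* (race is time-invariant within idcode in this dataset)
xtreg ln_wage c.age##c.age, fe vce(cluster race)
estimates store cl_group

* Coefficients are identical; only the standard errors differ
estimates table cl_id cl_group, se
```

The point estimates do not change with the cluster level, so any difference in conclusions comes entirely from the standard errors.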

Kind regards,
Ethan

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712
#4

14 Oct 2021, 01:00

Ethan:
1) you're investigating whether there's evidence of a within panel-wise effect in your dataset. Hence, taking a look at the -within R-sq- value provided by Stata is a good habit (unfortunately, there's no hard and fast rule to define a cut-off/threshold value).
2) in addition to wiping out time-invariant predictors and getting rid of unobserved time-invariant heterogeneity, the -fe- estimator allows for a weak form of endogeneity (i.e., the -u- error component is correlated with the vector of predictors). Hence, taking a look at the -u- correlation value reported by Stata is a good habit (unfortunately again, there's no hard and fast rule to define a cut-off/threshold value).
3) if you're interested in two-way clustering with -fe-, you may want to consider the community-contributed module -reghdfe-, an application of which is reported in the following toy-example:

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
. reghdfe ln_wage c.age##c.age, abs(idcode) vce(cluster idcode race)
(dropped 551 singleton observations)
(converged in 1 iterations)

HDFE Linear regression                            Number of obs   =     27,959
Absorbing 1 HDFE group                            F(   2,      2) =    4669.23
Statistics robust to heteroskedasticity           Prob > F        =     0.0002
                                                  R-squared       =     0.6564
                                                  Adj R-squared   =     0.5963
Number of clusters (idcode)  =      4,159         Within R-sq.    =     0.1087
Number of clusters (race)    =          3         Root MSE        =     0.3025

                            (Std. Err. adjusted for 3 clusters in idcode race)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0539076   .0068973     7.82   0.016     .0242309    .0835844
             |
 c.age#c.age |  -.0005973   .0001058    -5.65   0.030    -.0010525   -.0001421
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     |
-------------+-------------------------------------------------|
      idcode |            0            4159           4159 *   |
---------------------------------------------------------------+
* = fixed effect nested within cluster; treated as redundant for DoF computation

.
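
For the diagnostics mentioned in points 1) and 2), the values printed by -xtreg, fe- can also be retrieved from its stored results. The following is a minimal sketch on the same toy dataset, assuming clustering on the panel identifier; -e(r2_w)-, -e(corr)- and -e(rho)- are the stored results documented for -xtreg, fe-.

```stata
* Minimal sketch on the toy dataset used above
webuse nlswork, clear
xtset idcode year

* Fixed-effects estimator with cluster-robust standard errors
xtreg ln_wage c.age##c.age, fe vce(cluster idcode)

* Within R-squared, corr(u_i, Xb), and rho
* (rho = share of total variance due to the panel-level effect u_i)
display "within R-sq   = " %6.4f e(r2_w)
display "corr(u_i, Xb) = " %6.4f e(corr)
display "rho           = " %6.4f e(rho)
```

As noted above, none of these quantities comes with a formal cut-off; they are descriptive checks rather than tests.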

Kind regards,
Carlo
(Stata 19.0)


Ethan OLeary

Join Date: Oct 2021

Posts: 4
#5

19 Oct 2021, 06:24

Carlo,
I really appreciate your responses, and I am sorry it took me so long to express my gratitude. This has helped me immensely. I wish you a pleasant day.
Best,
Ethan