  • Level of significance different with reghdfe

    Dear all,

    I have a panel dataset organised by cell and quarter. My main regression is the following:
    xi: reghdfe y x1 x2 x3, a(cell quarter) vce(cluster cell quarter)

    I want to include an additional fixed effect, namely a postcode fixed effect. Therefore, I ran:
    xi: reghdfe y x1 x2 x3, a(cell postcode quarter) vce(cluster cell quarter)
    For comparison, I also performed the following:
    xi: reghdfe y x1 x2 x3 i.postcode, a(cell quarter) vce(cluster cell quarter)
    I get the same coefficient, the same number of observations and R-squared. However, I got a different level of significance.

    I was not expecting this. What could explain this difference in the level of significance and which one is the more correct approach?

    Thank you!

  • #2
    reghdfe is from SSC, authored by Sergio Correia (FAQ Advice #12).

    xi: reghdfe y x1 x2 x3, a(cell postcode quarter) vce(cluster cell quarter)
    xi: reghdfe y x1 x2 x3 i.postcode, a(cell quarter) vce(cluster cell quarter)
    The -xi- prefix is superfluous here. Do not include it; reghdfe supports factor variables.

    I get the same coefficient, the same number of observations and R-squared. However, I got a different level of significance.
    There are two issues here: the total degrees of freedom, and the adjustment that the estimator applies to the degrees of freedom. By including indicators for the fixed effects instead of absorbing them, you alter the model's degrees of freedom, so the two regressions are based on different degrees of freedom even though the coefficients coincide. For the second issue, see -help reghdfe- for more on how the degrees of freedom are adjusted. You will find the same phenomenon if you compare fixed-effects estimates from areg and xtreg. Here is a relevant paragraph from reghdfe's documentation:


    clusters will check if a fixed effect is nested within a clustervar. In that case,
    it will set e(K#)==e(M#) and no degrees-of-freedom will be lost due to this fixed
    effect. The rationale is that we are already assuming that the number of effective
    observations is the number of cluster levels. This is the same adjustment that
    xtreg, fe does, but areg does not use it.
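The areg versus xtreg comparison mentioned above can be sketched on one of Stata's shipped example datasets. This is only an illustration under assumptions: the dataset (nlswork) and the regressors are chosen for convenience and have nothing to do with the original poster's data, and whether and by how much the standard errors differ may depend on the Stata version and its defaults.

```stata
* Illustration on a shipped example dataset, not the poster's data
webuse nlswork, clear
xtset idcode year

* Within estimator; the individual effects are nested in the cluster variable
xtreg ln_wage age tenure, fe vce(cluster idcode)
estimates store xt_fe

* Same model with the individual effects absorbed by areg
areg ln_wage age tenure, absorb(idcode) vce(cluster idcode)
estimates store areg_fe

* Side by side: compare the slopes and the standard errors
estimates table xt_fe areg_fe, keep(age tenure) b(%9.5f) se(%9.5f)
```

The slope estimates should agree across the two columns, while the standard errors reflect whichever degrees-of-freedom adjustment each command applies.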

    which one is the more correct approach?
    Both are valid, as the estimation method affects the resulting degrees of freedom. In any case, the differences should not be material.
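One way to see where the two specifications part ways is to inspect the degrees of freedom that reghdfe stores after each fit. The sketch below reuses the variable names from the original post and assumes that data is in memory; the stored results referenced (e(df_m), e(df_a), e(df_r)) are the ones I understand reghdfe to report, so check -ereturn list- after estimation if any of them is missing in your version.

```stata
* Requires reghdfe and its dependency: ssc install ftools, then ssc install reghdfe
* y, x1-x3, cell, postcode and quarter are the poster's variable names

* Postcode absorbed as a fixed effect
reghdfe y x1 x2 x3, absorb(cell postcode quarter) vce(cluster cell quarter)
display "df_m = " e(df_m) "  df_a = " e(df_a) "  df_r = " e(df_r)
estimates store absorbed

* Postcode entered as factor-variable indicators (no xi prefix needed)
reghdfe y x1 x2 x3 i.postcode, absorb(cell quarter) vce(cluster cell quarter)
display "df_m = " e(df_m) "  df_a = " e(df_a) "  df_r = " e(df_r)
estimates store indicators

* Coefficients and standard errors of the regressors of interest, side by side
estimates table absorbed indicators, keep(x1 x2 x3) b(%9.4f) se(%9.4f)
```

If the coefficients match but the degrees-of-freedom counts differ between the two fits, that difference is what feeds into the standard errors and hence the reported significance levels.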
    Last edited by Andrew Musau; 24 Mar 2021, 11:53.
