  • What is the correct way to use double-clustering in reghdfe?

    Hi all,

    I have a panel data set, and I believe the error terms are correlated along two dimensions: within firm and within time period. What is the correct way to use double-clustering in reghdfe?

    These two give me different results:

    Code:
    reghdfe y1 x1 x2, absorb(year) cluster(companyid year)
    and

    Code:
    reghdfe y1 x1 x2, absorb(year) cluster(companyid#year)
    A paper I've been following states: "clustering by firm and year to account for correlations among error terms within firm and within time in the analyses."

    Which one should I use?

    Many thanks.

  • #2
    You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions. For example, I don't know how many observations you have per firm-year.

    The documentation for reghdfe notes that "each cluster variable must have at least 50 different categories..." I suspect you're better off with companyid and year rather than companyid#year. Do you really have more than one observation per company-year? If not, companyid#year defines a separate cluster for every observation, which does not seem to make sense. There is some work that claims larger clusters are better because the estimates are robust to within-cluster heteroskedasticity but not cross-cluster heteroskedasticity.
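
    A minimal sketch of how you could check this yourself, assuming the variable names from your post (y1, x1, x2, companyid, year). The variable yeartag is a made-up helper name, and I have not run this on your data:

    Code:
    * Is there more than one observation per company-year? (surplus > 0 means yes)
    duplicates report companyid year

    * How many distinct years are there? (relevant to the number-of-clusters note)
    egen yeartag = tag(year)
    count if yeartag

    * Two-way clustering: firm and year as separate cluster dimensions
    reghdfe y1 x1 x2, absorb(year) vce(cluster companyid year)

    * One-way clustering on firm-year cells (the interaction)
    reghdfe y1 x1 x2, absorb(year) vce(cluster companyid#year)
    If duplicates report shows no surplus observations, each firm-year cell holds one observation, so the second specification should behave like heteroskedasticity-robust standard errors rather than two-way clustering. The first specification is the one that matches "clustering by firm and year".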
