  • Clustered standard errors

    I am unsure whether to use -, cluster(sector)- or just -, robust-.
    I read somewhere on this forum, and I really cannot find it again, that clustering is less useful when there are many clusters and few observations within each cluster.
    Is this correct?
    I have 78 observations and would have 15 clusters.

    Thanks

  • #2
    I think you have it backwards. Cluster robust variance estimators are only valid when the number of clusters is sufficiently large. The number of observations within each cluster is less important (although singleton clusters are problematic). There is no consensus about how many clusters suffice to make the use of cluster robust variance estimators valid. 15 is a borderline case.

    If I were working with a data set like this, I would probably do it both with and without clustered standard errors and hope that the results don't differ much. If they do differ a great deal, that is a dilemma, and you would probably have to resolve it by seeing what others in your own discipline have done in similar circumstances. There is no mathematically correct answer to this, and different disciplines tend to have different preferences about handling this.
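
    A minimal sketch of that side-by-side comparison, written in Python rather than Stata so the variance formulas are visible. Everything here is an assumption for illustration: the function name `ols_slope_se` is hypothetical, the data are simulated (78 observations in 15 sectors, to mirror the question), the model has a single regressor, and the finite-sample factors N/(N-K) and G/(G-1) * (N-1)/(N-K) are assumed to mirror what -regress- applies.

```python
import math
import random
from collections import defaultdict

def ols_slope_se(x, y, cluster=None):
    """Slope of y = a + b*x with a sandwich standard error.

    cluster=None  -> heteroskedasticity-robust SE, factor N/(N-K)
    cluster=list  -> cluster-robust SE, factor G/(G-1) * (N-1)/(N-K)
    (both factors are assumptions meant to mirror Stata's defaults)
    """
    n, k = len(x), 2                      # K = intercept + slope
    xbar, ybar = sum(x) / n, sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    # residual = (y - ybar) - b*(x - xbar); score = demeaned x times residual
    scores = [d * ((yi - ybar) - b * d) for d, yi in zip(dx, y)]

    if cluster is None:
        meat = sum(s * s for s in scores)
        factor = n / (n - k)
    else:
        sums = defaultdict(float)
        for g, s in zip(cluster, scores):
            sums[g] += s                  # add up scores within each cluster
        g_count = len(sums)
        meat = sum(s * s for s in sums.values())
        factor = g_count / (g_count - 1) * (n - 1) / (n - k)
    return b, math.sqrt(factor * meat) / sxx

# Simulated data with a common shock inside each sector (hypothetical).
rng = random.Random(2024)
sector = [i % 15 for i in range(78)]
sector_shock = [rng.gauss(0, 1) for _ in range(15)]
x = [rng.gauss(0, 1) for _ in sector]
y = [1.0 + 0.5 * xi + sector_shock[g] + rng.gauss(0, 1)
     for xi, g in zip(x, sector)]

b, se_robust = ols_slope_se(x, y)
_, se_cluster = ols_slope_se(x, y, cluster=sector)
print(f"slope={b:.3f}  robust SE={se_robust:.3f}  clustered SE={se_cluster:.3f}")
```

    The point estimate is the same in both calls; only the standard error changes. With the factors assumed above, treating every observation as its own cluster reproduces the robust standard error exactly.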

    I should point out that if the underlying regression command here is -xtreg, fe-, there is no distinction between -robust- and -cluster(sector)-, provided sector is the panel variable: -robust- standard errors without clustering are invalid with -xtreg, fe-, and Stata will automatically calculate standard errors clustered on the panel variable if you specify -robust- with -xtreg, fe- (assuming you are using version 13 or later). The real issue is whether to use clustered standard errors or just the ordinary ones.


    • #3
      Clyde, thank you so much! I can really work from here!
      The underlying regression command was -reg- and later -xtreg, re-, but thanks for pointing that out! I really appreciate it!


      • #4
        Assuming you are working with -reg-, ask yourself whether
        1) there are clusters in the underlying population that do not show up in your sample
        2) regressors are (perfectly) correlated across individuals within each cluster
        Only if the answer to both questions is no is there no need to cluster (see https://arxiv.org/abs/1710.02926).

        But even if you decide to cluster in light of the considerations above, your results might not be accurate since, as Clyde pointed out, the asymptotics of the cluster-robust estimator rely on the number of clusters rather than the number of individuals. Therefore, your cluster-robust standard errors might suffer from severe downward bias. That is, you are not guaranteed to be on the safe side just because the different standard errors are numerically similar.
        As far as I know, Stata applies a "few clusters" correction by default in order to reduce the bias of the cluster-robust variance matrix estimator. However, this adjustment is known not to work very well (see, for instance, the simulation study in https://www.mitpressjournals.org/doi...2/REST_a_00552), and therefore you should not rely merely on the comparison between -, robust- and -, cluster(sector)-.
        I'd recommend using more powerful corrections for the few-clusters problem, for instance the Bell-McCaffrey adjustment (see Imbens and Kolesár (2016)), or bootstrap-based inference (https://www.mitpressjournals.org/doi.../rest.90.3.414).
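
        To make the bootstrap-based option concrete, here is a rough sketch of one common variant, a wild cluster bootstrap-t with Rademacher weights and the null hypothesis imposed, for testing a zero slope in a one-regressor model. It is written in Python only to show the steps. The function names `cluster_t` and `wild_cluster_pvalue`, the defaults `reps=999` and `seed=12345`, and the simulated data are all hypothetical, and the finite-sample factor inside `cluster_t` is an assumption. This is an illustration under those assumptions, not a substitute for a tested implementation.

```python
import math
import random
from collections import defaultdict

def cluster_t(x, y, cluster):
    """t statistic for the slope in y = a + b*x using a cluster-robust SE."""
    n, k = len(x), 2
    xbar, ybar = sum(x) / n, sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    sums = defaultdict(float)
    for g, d, yi in zip(cluster, dx, y):
        sums[g] += d * ((yi - ybar) - b * d)   # cluster sum of x-times-residual
    g_count = len(sums)
    factor = g_count / (g_count - 1) * (n - 1) / (n - k)  # assumed correction
    se = math.sqrt(factor * sum(s * s for s in sums.values())) / sxx
    return b / se

def wild_cluster_pvalue(x, y, cluster, reps=999, seed=12345):
    """Two-sided bootstrap p-value for H0: slope = 0."""
    rng = random.Random(seed)
    t_obs = cluster_t(x, y, cluster)
    ybar = sum(y) / len(y)
    u = [yi - ybar for yi in y]            # residuals with slope = 0 imposed
    groups = sorted(set(cluster))
    exceed = 0
    for _ in range(reps):
        # one +1/-1 draw per cluster, shared by every observation in it
        w = {g: rng.choice((-1.0, 1.0)) for g in groups}
        y_star = [ybar + w[g] * ui for g, ui in zip(cluster, u)]
        if abs(cluster_t(x, y_star, cluster)) >= abs(t_obs):
            exceed += 1
    return exceed / reps

# Simulated data: 78 observations in 15 sectors (hypothetical).
rng = random.Random(7)
sector = [i % 15 for i in range(78)]
shock = [rng.gauss(0, 1) for _ in range(15)]
x = [rng.gauss(0, 1) for _ in sector]
y = [0.5 * xi + shock[g] + rng.gauss(0, 1) for xi, g in zip(x, sector)]
print("bootstrap p-value:", wild_cluster_pvalue(x, y, sector))
```

        The weights are drawn per cluster, not per observation, so the within-cluster dependence of the residuals is carried into every bootstrap sample. The observed t statistic is then compared with the bootstrap distribution of t statistics instead of a normal or t reference distribution.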
