
  • How big should my clusters be in order to use cluster robust standard errors?

    I read somewhere that if my clusters are small, cluster-robust standard errors aren't valid anyway.

    Say I have 12 groups: 10 of them have 8 participants each and 2 have 7, so 94 observations in total.
    Would this be considered small? What should I be doing instead of cluster-robust SEs?
    Last edited by Ben Cheng; 30 Sep 2017, 22:46.

  • #2
    The size of the clusters is not that important (although clusters with only 1 observation are a problem). More important is the number of clusters. There is no hard and fast rule that everybody agrees upon. Probably most would say that 12 is too few, though some would say it is just barely sufficient.
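To make the role of the number of clusters concrete, here is a minimal sketch of the usual cluster-robust sandwich estimator for OLS, run on simulated data with the same layout as in the question (10 clusters of 8 and 2 clusters of 7). The data-generating process, the function name `cluster_robust_se`, and the seed are assumptions made for illustration only. The finite-sample factor G/(G-1) * (n-1)/(n-k) is one commonly used correction; other corrections exist. The "meat" of the sandwich is a sum of only G outer products, one per cluster, which is why G matters more than the cluster sizes.

```python
import numpy as np

def cluster_robust_se(X, y, groups, small_sample=True):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape

    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta

    labels = np.unique(groups)
    G = len(labels)

    # The meat is a sum of G outer products: one summed score per cluster.
    meat = np.zeros((k, k))
    for g in labels:
        mask = groups == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)

    V = bread @ meat @ bread
    if small_sample:
        # A commonly used finite-sample factor; it grows as G shrinks.
        V *= (G / (G - 1)) * ((n - 1) / (n - k))
    return beta, np.sqrt(np.diag(V)), G

# Simulated data (hypothetical): 10 clusters of 8 and 2 clusters of 7.
rng = np.random.default_rng(0)
sizes = [8] * 10 + [7] * 2
groups = np.repeat(np.arange(len(sizes)), sizes)
n = len(groups)

x = rng.normal(size=n)
cluster_effect = rng.normal(size=len(sizes))[groups]  # shared within cluster
y = 1.0 + 0.5 * x + cluster_effect + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, se, G = cluster_robust_se(X, y, groups)
_, se_raw, _ = cluster_robust_se(X, y, groups, small_sample=False)
print(f"n = {n}, clusters = {G}")
print("coefficients:", beta)
print("cluster-robust SE (corrected):  ", se)
print("cluster-robust SE (uncorrected):", se_raw)
```

With 12 clusters the correction inflates the standard errors by only about 5%, so it does not by itself settle whether 12 clusters are enough.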


    • #3
      Clyde Schechter Hello Clyde, do you have a reference saying that clusters with only 1 observation are a problem?


      • #4
        A good general reference on this topic is the following review article: A. Colin Cameron and Douglas L. Miller (2015), "A Practitioner's Guide to Cluster-Robust Inference," Journal of Human Resources 50(2).


        • #5
          Thank you, Stephen. In my case I have a lot (>1 million) of clusters, but many of them include one observation only: the clusters (sampling units) are mothers, while the individual observations are pregnancies.


          • #6
            For the specific question that Andrea Discacciati asks, see https://www.stata.com/statalist/arch.../msg00594.html. There, StataCorp's Vince Wiggins explains how the robust VCE is calculated and why, in the presence of singleton clusters, it is singular. The consequence is that you will get missing values for overall model tests, and perhaps for other simultaneous tests of multiple hypotheses. Nevertheless, when you do get a result, it is reliable; it's just that some statistics you might be interested in cannot be calculated. (The explanation given there is actually for a different but related situation: an indicator variable that is all zero except for 1 observation. But the reasoning is exactly the same for singleton clusters.)
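The related situation mentioned in parentheses (an indicator that is 1 for exactly one observation) can be checked numerically. The sketch below is an illustration on simulated data, not a reproduction of the linked post: the data, the seed, and the variable names are assumptions. It uses per-observation scores, which is the same as treating every observation as its own cluster. Because OLS fits the flagged observation exactly, its residual is zero, the indicator's score column is all zeros, and the robust VCE loses one rank.

```python
import numpy as np

# Simulated data (hypothetical): d is 1 for a single observation only.
rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
d = np.zeros(n)
d[0] = 1.0
X = np.column_stack([np.ones(n), x, d])
y = 1.0 + 0.5 * x + rng.normal(size=n)

bread = np.linalg.inv(X.T @ X)
beta = bread @ X.T @ y
resid = y - X @ beta

# Per-observation scores: each observation acts as its own cluster.
scores = X * resid[:, None]
meat = scores.T @ scores
V = bread @ meat @ bread

k = X.shape[1]
rank_meat = np.linalg.matrix_rank(meat)
rank_V = np.linalg.matrix_rank(V, tol=1e-10)  # explicit tolerance for round-off

print("residual of the flagged observation:", resid[0])
print("rank of meat:", rank_meat, "of", k)
print("rank of robust VCE:", rank_V, "of", k)
```

A joint Wald test of all three coefficients would need the inverse of the 3 x 3 VCE, which does not exist here. That is consistent with overall model tests being reported as missing while individual coefficients and standard errors are still produced.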


            • #7
              Thank you all for the information
