  • Random effects vs. clustering

    Dear all

    I'm analysing a child's probability of being in school using meprobit with random effects at the household and community levels.
    Could anyone explain to me in simple terms (or give a reference for) the difference between using this type of model and clustering the standard errors at the household and community levels?

    I'm sorry if my question is really basic.

    Best regards /Elin Vimefall

  • #2
    Elin: I assume that each household belongs to a unique community, in which case one clusters at the community level, provided the sample is in fact a cluster sample at the community level or you have included community-level covariates in the analysis. It is always a good idea, I think, to use a pooled probit and cluster the standard errors, whether at the household or community level.

    Are you allowing random slopes in the model, or just a random intercept? If just a random intercept, you can get most of what you want with the pooled probit and clustering: Just add any household or community-level variables directly to the model.

    Other than the computational burden, there are a couple of drawbacks to the mixed model. First, it assumes a lot of normality on the heterogeneity, and it does so only to try to enhance efficiency. If clustering a simple probit gives you tight enough confidence intervals, I would go with that. Second, there is some ambiguity about how marginal effects are computed in the mixed models. By contrast, the average partial (marginal) effects for probit are standard and widely agreed on.

    Which variables are you mainly interested in? Individual? Household? Community? It makes a difference. JW
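
    A minimal sketch of the pooled-probit alternative described above, assuming the variable names from the meprobit command in post #4 (enrolled, X1 X2 X3, id_community) and that each household belongs to exactly one community. It is an illustration of the approach, not a recommended specification:

    ```stata
    * Pooled probit with standard errors clustered at the community level
    * (variable names are taken from post #4 and are assumptions here)
    probit enrolled X1 X2 X3, vce(cluster id_community)

    * Average partial (marginal) effects for all covariates
    margins, dydx(*)
    ```

    Household- or community-level covariates, if available, would simply be added to the variable list.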



    • #3
      There's some good discussion of these issues here:

      http://www.stata.com/meeting/13uk/nichols_crse.pdf

      http://www.ats.ucla.edu/stat/stata/library/cpsu.htm

      http://cameron.econ.ucdavis.edu/rese...ober152013.pdf
      ____________________________________________________
      Assistant Professor, Department of Biostatistics and Epidemiology
      School of Public Health and Health Sciences
      University of Massachusetts- Amherst



      • #4
        Thanks for your help!

        Each child belongs to a household and each household belongs to a small community (a group of ten households).
        I only use random intercepts (one for household and one for community).

        meprobit enrolled X1 X2 X3 || id_community: || household:


        One follow-up question: is it possible to cluster on both household and community?


        Best regards /Elin Vimefall



        • #5
          If you cluster at the community level then clustering at the household level is redundant. Pooled probit with community-level clustering makes the fewest assumptions, but you are giving away some efficiency.



          • #6
            Thank you Jeff and Andrew, your answers really helped me!

            Best regards /Elin Vimefall



            • #7
              Originally posted by Jeff Wooldridge
              If you cluster at the community level then clustering at the household level is redundant. Pooled probit with community-level clustering makes the fewest assumptions, but you are giving away some efficiency.
              Thank you for these great posts. I have a related question, and would love your take, Jeff. For multilevel data (I am using data on students within schools), is it appropriate to run a mixed linear model with a random intercept and random slopes AND also cluster standard errors at the school level, or is clustering standard errors redundant given the mixed model? Also, let’s assume the distribution (and even the realization) of covariates is identical across clusters.

              My understanding from Abadie et al. is that the random intercept will account for dependence within a cluster, but this is neither necessary nor sufficient to make an appropriate decision about clustering standard errors, because what matters is the product of the covariates and residuals. Is there any harm in using vce(cluster school) with a mixed model?
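
              For concreteness, a minimal sketch of the comparison being asked about. The names score (outcome) and x (student-level covariate) are hypothetical; school is the cluster identifier from the question. It fits the same random-intercept, random-slope model with model-based and then school-clustered standard errors so the two can be compared side by side. It does not settle which set is appropriate:

              ```stata
              * Hypothetical names: score (outcome), x (student-level covariate)
              * Random intercept and random slope on x, model-based standard errors
              mixed score x || school: x
              estimates store model_based

              * Same model with cluster-robust standard errors at the school level
              mixed score x || school: x, vce(cluster school)
              estimates store cluster_robust

              * Coefficients and standard errors from both fits, side by side
              estimates table model_based cluster_robust, se
              ```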
