  • Minimum observations per group in mixed effects regression

    I am working on an unadjusted mixed effects regression looking at changes in several scale scores from baseline to 6-months follow-up and adjusting for clustering at the participant level using long data.

    We have already imputed scores for anyone who had at least 75% of any given scale complete, which has left 1-2 participants with missing data for any given scale at baseline or 6-months. Because of this, in the regression output, there are some groups (i.e., participants) for whom there is one observation (min observations per group = 1) when it should always be two (each time point).

    I am wondering whether Stata ignores groups that are missing data at one of the time points, or whether I need to drop these rows myself. If so, would I need to reshape the data to wide in order to restrict to cases where the scale score is complete at both time points?

    I am already restricting to rows with complete data for the outcome and to participants that we are considering "on study" at baseline AND 6-months (base_and_6mo_complete), but this still allows cases for which the variable is only missing at one time point. The mixed effects code is below:

    Code:
    * Fit a random-intercept model per outcome; the if qualifier goes before ||
    foreach var of varlist OUTCOMES {
        mixed `var' i.VISIT if !missing(`var') & base_and_6mo_complete == 1 || id:
    }

    Thanks in advance for any guidance!

    Last edited by Kristin Bevilacqua; 31 Jul 2025, 13:16.

  • #2
    If the scores are missing at random, there is no need to impute them. In mixed-effects linear models, multiple imputation is usually not helpful.

    You mentioned an 'unadjusted' model, but it is unclear what the term means here. How have you treated the baseline measurement? Are you defining the outcome as change from baseline? Are you using the baseline as a covariate? Or are you including the baseline measurement as the first observation of the repeated measurements?

    In a mixed-effects model with random effects, the inclusion of participants with only baseline measurements helps the estimation of the within-group means, but usually does not impact the treatment effects. You can also include participants with only 6-month scores, which will help the treatment effect estimation at that time point.


    • #3
      Thanks for your reply, Taigo.

      Yes, we are defining the outcome as change from baseline to six months. The concern is that some participants have outcome data only at baseline or only at 6 months, which is why some participants (groups) have only one observation. It seems like you are saying this is okay and not something to worry about in terms of estimating the treatment effects, correct?

      Thank you again!


      • #4
        I meant:

        In a mixed-effects model with random effects (with the baseline measurement as the first observation of the repeated measurements), the inclusion of participants with only baseline measurements helps the estimation of the within-group means, but usually does not impact the treatment effects.
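
        As a minimal sketch of that setup, assuming long data with one row per participant per visit, a hypothetical outcome variable named score, and VISIT coded 0 for baseline and 1 for 6 months (the variable name and the coding are assumptions, not stated in this thread):

        ```stata
        * Baseline enters as the first repeated measurement (VISIT == 0, assumed coding)
        * score is a hypothetical outcome name used only for illustration
        mixed score i.VISIT || id:

        * The coefficient on 1.VISIT estimates the mean change from baseline to 6 months
        lincom 1.VISIT
        ```

        Participants with a single non-missing score stay in the estimation sample under this specification; only rows with a missing outcome are dropped.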


        • #5
          Understood! Thank you again,
          Kristin


          • #6
            Generally, you shouldn't manually drop observations from participants who have data at only one time point. Most official estimators in Stata, including mixed, are designed to handle unbalanced panels, and dropping those observations may introduce bias into your analysis.
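
            To see how unbalanced the data are without dropping anything, one option is to count the non-missing outcome observations per participant. This is only a sketch: score is a hypothetical outcome name, and n_obs and one_per_id are helper variables introduced here for illustration.

            ```stata
            * Number of non-missing outcome values for each participant
            bysort id: egen n_obs = count(score)

            * Flag exactly one row per participant so each is tabulated once
            egen byte one_per_id = tag(id)

            * Distribution of observations per participant (0, 1, or 2 expected)
            tabulate n_obs if one_per_id
            ```

            The counts of 1 in this table should correspond to the groups behind the "min observations per group = 1" line in the mixed output.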


            • #7
              Thank you, Andrew!
