
  • RCT with repeated measures help

    Hi everyone. I may be overthinking this and going down too many rabbit holes of possibilities, but can someone please help me figure out the appropriate ("best") method to use for this research project? It's a randomized controlled trial with two groups (treatment and control, 60 in each, so 120 total) and repeated measures. Note that we do expect a good deal of attrition, so missing data will need to be considered. Both outcomes are continuous measures. The primary outcome will be measured at baseline and post-treatment. The secondary outcome will be measured at two follow-ups that occur after the post-treatment assessment (though I can probably argue to have this one also measured at baseline and/or post-treatment in addition to the two follow-ups).

    For the primary outcome, what is the best way to account for the baseline measure of the outcome? Is it as simple as running an ANOVA with group as the main predictor and the baseline as a covariate? Or should this be done in some kind of mixed model?

    For the secondary outcome, what are my options if we only have the two follow-up measures of the outcome? And what would be the best method if we also assessed it at baseline and/or at the post-treatment assessment?

    Right now there are no plans to include any covariates, but I'm wondering whether that's the best approach. We could, for example, include any covariates that differ between the two groups at baseline and are related to the outcome.

    Any suggestions for best methods and links to Stata resources would be greatly appreciated!

    Finally, is there an easy way to conduct a power analysis for this, either in Stata or elsewhere?

    Thank you!!

  • #2
    Originally posted by Jennifer Carr:
    For the primary outcome, what is the best way to account for the baseline measure of the outcome? Is it as simple as running an ANOVA with group as the main predictor and the baseline as a covariate? Or should this be done in some kind of mixed model?
    Take a look at this post for advice on these questions.

    For the secondary outcome, what are my options if we only have the two follow-up measures of the outcome? And what would be the best method if we also assessed it at baseline and/or at the post-treatment assessment?
    I think that Clyde's suggestions in that post would be just as pertinent here, too.

    Finally, is there an easy way to conduct a power analysis for this, either in Stata or elsewhere?
    Have you had a chance to look at the help file for Stata's official power analysis for repeated-measures analysis of variance? If not, then that might be a good place to start on this.
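
    If it helps to see the moving parts, here is a minimal sketch of a power repeated call for a 2 (group) x 2 (time) design; the cell means, error variance, and correlation between repeated measurements below are placeholder values you would replace with numbers from pilot data or the literature.

    Code:
    * hypothetical cell means: row 1 = control, row 2 = treatment; columns = baseline, post-treatment
    matrix cellmeans = (20, 20 \ 20, 24)
    * sample size for the group-by-time (between-within) interaction at the default 80% power
    power repeated cellmeans, corr(0.5) varerror(25) factor(bwithin)
    Keep in mind that this calculation assumes complete data, so you would want to inflate the resulting sample size for the attrition you expect.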

    • #3
      My view is nothing special is required. I would analyze each outcome separately -- that is, each time period separately -- and control for the base period outcome to improve efficiency. With an RCT, you could just compare the averages in each time period between the treated and control. But, done properly, regression adjustment can (but need not) shrink standard errors.
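
      For the unadjusted comparison, that is just a difference in means in each period (a minimal sketch; outcome_t and treat are the placeholder variable names used in the code below):

      Code:
      * unadjusted treated-vs-control comparison of the post-treatment outcome
      ttest outcome_t, by(treat)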

      The mechanics differ by whether the data are in "long" or "wide" format. Is this stored as a panel data set, with a different record for each time period, or is there just one record per individual (wide)?

      Here's an example assuming wide format:

      Code:
      * demean the baseline outcome so that the coefficient on treat is the ATE
      sum outcome_base
      gen outcome_base_dm = outcome_base - r(mean)
      * regression adjustment with a full treatment-by-baseline interaction
      reg outcome_t i.treat c.outcome_base_dm i.treat#c.outcome_base_dm, vce(robust)
      * average treatment effect, with standard errors that account for sampling variation in the covariate
      margins, dydx(treat) vce(uncond)
      Or, even easier, you can obtain the ATE without the other regression coefficients:

      Code:
      teffects ra (outcome_t outcome_base) (treat), ate
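
      If the data turn out to be stored long instead (one record per person per period), one minimal way to get to the wide layout assumed above, using hypothetical variable names id, period, and outcome, is:

      Code:
      * one row per person, with outcome0, outcome1, ... as separate columns
      reshape wide outcome, i(id) j(period)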

      • #4
        To supplement the excellent answers of Joseph Coveney and Jeff Wooldridge, and to illustrate something that Clyde Schechter mentioned in the linked post, I simulated some data that demonstrate the equivalence of the two approaches and show how the ICC is a scaling factor that lets you go from the mixed treatment effect estimate to the OLS treatment effect estimate. Jennifer Carr, this could also be used as the setup for a simulation-based power analysis.
        Code:
        version 16.1
        clear*
        set seed 869703
        
        *Sample size - set to whatever you want; I go big for precision
        set obs 4800
        gen id = _n
        
        *Treatment assignment to half the sample
        gen trtmt = 0
        replace trtmt = 1 if id>2400
        
        *Two measurement occasions (occ) per id
        gen obs_per_id = 2
        expand obs_per_id
        by id, sort: gen occ = _n-1
        
        *simulate random intercept and residual based on a chosen ICC (rho)
        local rho = 0.4
        local sd_u = sqrt(`rho')
        local sd_e = sqrt(1-`rho')
        
        by id (occ), sort: gen u = rnormal(0, `sd_u') if _n == 1
        by id (occ): replace u = u[1]
        gen e = rnormal(0, `sd_e')
        
        *generate outcome w/ treatment effect turning on at occ==1
        gen y = .4*trtmt*occ + u + e
        Now you can estimate the treatment effect with mixed.
        Code:
        mixed y i.trtmt || id:, stddev 
        Mixed-effects ML regression                     Number of obs     =      9,600
        Group variable: id                              Number of groups  =      4,800
        
                                                        Obs per group:
                                                                      min =          2
                                                                      avg =        2.0
                                                                      max =          2
        
                                                        Wald chi2(1)      =      51.20
        Log likelihood = -13514.461                     Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
                   y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             1.trtmt |   .1744102   .0243754     7.16   0.000     .1266352    .2221852
               _cons |   .0255855    .017236     1.48   0.138    -.0081965    .0593675
        ------------------------------------------------------------------------------
        
        ------------------------------------------------------------------------------
          Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
        -----------------------------+------------------------------------------------
        id: Identity                 |
                           sd(_cons) |   .6145709   .0130847      .5894532     .640759
        -----------------------------+------------------------------------------------
                        sd(Residual) |   .8188973   .0083578       .802679    .8354433
        ------------------------------------------------------------------------------
        LR test vs. linear model: chibar2(01) = 667.44        Prob >= chibar2 = 0.0000
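        As an aside, the ICC implied by the fitted variance components (rather than the known simulation value used below) can be displayed after mixed:
        Code:
        * intraclass correlation computed from sd(_cons) and sd(Residual)
        estat icc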
        Use the ICC scaling factor of 0.4 to get the equivalent OLS treatment effect estimate:
        Code:
        lincom _b[1.trtmt]/.4
        
         ( 1)  2.5*[y]1.trtmt = 0
        
        ------------------------------------------------------------------------------
                   y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 (1) |   .4360255   .0609386     7.16   0.000     .3165881    .5554629
        ------------------------------------------------------------------------------
        Reshape to wide and run ANCOVA:
        Code:
        reshape wide y e, i(id) j(occ)
        *OLS
        reg y1 i.trtmt y0
        
              Source |       SS           df       MS      Number of obs   =     4,800
        -------------+----------------------------------   F(2, 4797)      =    556.09
               Model |  954.749259         2  477.374629   Prob > F        =    0.0000
            Residual |  4118.01281     4,797  .858455871   R-squared       =    0.1882
        -------------+----------------------------------   Adj R-squared   =    0.1879
               Total |  5072.76207     4,799  1.05704565   Root MSE        =    .92653
        
        ------------------------------------------------------------------------------
                  y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             1.trtmt |    .435298   .0267591    16.27   0.000     .3828378    .4877581
                  y0 |   .3894585   .0131588    29.60   0.000     .3636613    .4152558
               _cons |  -.0059857   .0189204    -0.32   0.752    -.0430785     .031107
        ------------------------------------------------------------------------------
        The ANCOVA estimate and the mixed estimate (once the scaling factor is accounted for) are within .001 of each other. You can stay with mixed and get a near-equivalent ANCOVA estimate by specifying a treatment-by-occasion interaction. Note that this is not exactly equivalent because we aren't adjusting for y0; instead we get a separate treatment effect estimate at each occasion.
        Code:
        reshape long
        mixed y i.trtmt##i.occ || id: , stddev
        
        Mixed-effects ML regression                     Number of obs     =      9,600
        Group variable: id                              Number of groups  =      4,800
        
                                                        Obs per group:
                                                                      min =          2
                                                                      avg =        2.0
                                                                      max =          2
        
                                                        Wald chi2(3)      =     430.63
        Log likelihood =  -13331.87                     Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
                   y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             1.trtmt |  -.0622383    .029208    -2.13   0.033    -.1194849   -.0049918
               1.occ |   -.031101   .0227572    -1.37   0.172    -.0757042    .0135023
                     |
           trtmt#occ |
                1 1  |   .4732971   .0321835    14.71   0.000     .4102186    .5363756
                     |
               _cons |    .041136   .0206532     1.99   0.046     .0006566    .0816155
        ------------------------------------------------------------------------------
        
        ------------------------------------------------------------------------------
          Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
        -----------------------------+------------------------------------------------
        id: Identity                 |
                           sd(_cons) |   .6342401   .0125158      .6101779    .6592511
        -----------------------------+------------------------------------------------
                        sd(Residual) |   .7883315   .0080459      .7727186      .80426
        ------------------------------------------------------------------------------
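        If needed, the occasion-1 treatment contrast from this interaction model can be pulled out directly (a minimal sketch; margins would work as well):
        Code:
        * treatment vs. control difference at the post-treatment occasion (occ==1)
        lincom 1.trtmt + 1.trtmt#1.occ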
