  • Analyzing a Clinical Trial with continuous outcome

    Hello everyone,

    I'm currently working on analyzing a clinical trial dataset in Stata and could use some guidance on the best approach to analyze the data, adjusting for baseline values, and implementing a mixed effects model. Here are the key details of the clinical trial:
    1. Study Design:
      • The trial consists of two arms: a placebo arm and a treatment arm (variable name: 'arm').
      • There are four time points: baseline, 6 months, 12 months, and 18 months (6 months after treatment discontinuation at 12 months).
      • Each time point's data is represented as a row in Stata.
      • The primary outcome variable of interest is a continuous measure called 'test1,' which represents damage to the epithelium and ranges from 0 to 100.
    2. Created Variables:
      • For each patient, I have calculated the change in test1 from baseline to each subsequent time point.
      • The change in test1 between each time point and baseline is represented by the variables 'test_change_b6,' 'test_change_b12,' and 'test_change_b18.'
    3. Baseline Variation:
      • Baseline values differ between the arms, with a mean of 30 in the treatment arm and a mean of 36 in the control arm. The standard deviations are similar.
    4. Baseline Diagnosis:
      • Patients in the trial are categorized into two baseline diagnosis groups: mild disease and severe disease.
    Now, I have a few questions regarding the analysis of this dataset:
    1. Which would be the best method to analyze this data, considering the characteristics mentioned above?
    2. Is there any way to adjust for the baseline test values while analyzing the treatment effect over time?
    3. How would I fit a mixed effects model to account for the longitudinal nature of the data and the correlation between repeated measures within individuals?
    Any insights, recommendations, or examples of Stata syntax to perform these analyses would be greatly appreciated. Thank you in advance for your help!

    Best regards,

    Dalton

  • #2
    You do not say either way, but I'm going to assume in this response that the assignment to placebo or active treatment in this trial was randomized. Things will be simpler under that assumption.

    You rightly identify that a mixed effects model will be appropriate. I would start with:
    Code:
    mixed test1 i.arm##i.time || patient_id:
    where arm is a dichotomous variable distinguishing placebo and active treatment, time is a four level categorical variable specifying the four time points, and patient_id is a variable that identifies distinct patients in the study.

    As this is, by my assumption, a randomized trial, it is likely that you do not need to adjust for any covariates. You should, of course, first check whether there actually are any confounders, and if so, include them as covariates. You may also want to include other covariates that, though not confounders, inflate outcome variance if not included. Your treatment effect will be estimated by the coefficients of the three interaction terms, showing the treatment effect at each time period.
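
    A minimal sketch of how the interaction terms from this model could be inspected, run on simulated data so that it is self-contained. The simulated values, the 0/1 coding of arm, and the 0 to 3 coding of time are assumptions made for illustration, not details taken from the trial.

```stata
* Simulated long-format data; every number here is invented for illustration.
clear
set seed 12345
set obs 100                                  // 100 hypothetical patients
generate patient_id = _n
generate arm = mod(_n, 2)                    // assumed coding: 0 = placebo, 1 = treatment
generate u = rnormal(0, 8)                   // patient-level random intercept
expand 4                                     // four rows (time points) per patient
bysort patient_id: generate time = _n - 1    // assumed coding: 0 = baseline, 1 = 6 mo, 2 = 12 mo, 3 = 18 mo
generate test1 = 35 + u - 3*arm*(time > 0) + rnormal(0, 5)

* Random-intercept model as written in the post
mixed test1 i.arm##i.time || patient_id:

* Joint test of the three arm-by-time interaction terms
testparm i.arm#i.time

* A single interaction term with its confidence interval, here the 12-month one
lincom 1.arm#2.time
```

    With the real data, only the -mixed-, -testparm-, and -lincom- lines would be needed, with the level numbers adjusted to match how arm and time are actually coded.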

    There is no need for change variables. In fact, they are usually a very bad idea. See https://www.fharrell.com/post/errmed/ for explanations of this and other common blunders in statistical analysis.

    There probably is no need to "adjust for" baseline values of test1 in this model. The model itself does that, and, in fact, does it better. There is an alternative analysis that omits the baseline observation altogether and includes the baseline value of test1 as a covariate. However, it is harder to interpret (the coefficients of the interaction terms are no longer unbiased estimates of the treatment effect). It is, however, preferable to go this route under certain conditions, namely when the baseline measurement was ascertained in a materially different way from the others. For example, I have seen trials where the baseline value of something is gleaned from medical records, or was measured in a different laboratory from the measurements taken at other timepoints in the study. In that situation, the error term for the baseline observation is not exchangeable with that of the other measurements, and the alternative design (eliminate baseline observation, include baseline value of outcome as covariate) is needed. But if all of the timepoints have the outcome variable ascertained in the same way, there is no need to do that, and it is simpler not to.
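
    For reference, the alternative design described above (drop the baseline rows, carry the baseline value as a covariate) might look roughly like the following. This is a sketch on simulated data; the variable name baseline_test1 is hypothetical, and the code assumes every patient has a non-missing baseline row coded time == 0.

```stata
* Simulated long-format data; all values are invented for illustration.
clear
set seed 2023
set obs 100
generate patient_id = _n
generate arm = mod(_n, 2)                    // assumed coding: 0 = placebo, 1 = treatment
generate u = rnormal(0, 8)                   // patient-level random intercept
expand 4
bysort patient_id: generate time = _n - 1    // assumed coding: 0 = baseline, 1-3 = follow-up visits
generate test1 = 35 + u - 3*arm*(time > 0) + rnormal(0, 5)

* Copy each patient's baseline value onto all of that patient's rows
* (hypothetical variable name; relies on the baseline row sorting first)
bysort patient_id (time): generate baseline_test1 = test1[1]

* Exclude the baseline rows from estimation and adjust for the baseline value
mixed test1 i.arm##i.time c.baseline_test1 if time > 0 || patient_id:

* Between-arm difference at each follow-up visit, adjusted for baseline
margins time, dydx(arm)
```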

    You may additionally want to look at whether the effect of treatment differs in patients with mild vs severe disease. (Though, frankly, it is rare to see a clinical study with a large enough enrollment to power this kind of analysis.) In that case, you would expand the model to a three way interaction:
    Code:
    mixed test1 i.arm##i.time##i.severity || patient_id:
    margins time#severity, dydx(arm)
    The output of the -margins- command will give you the effect of treatment in each severity class at each time point in your study.

    There is one other issue. The above discussion assumes that your 0-100 variable test1 has a "nice" distribution. By "nice" I mean that it is not bunched up at either end of the scale. (Some extreme observations at 0 or 100 are fine, but if the distribution shows a floor or ceiling effect of the measure, it is going to be difficult to do a good analysis.)
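
    One possible way to look for bunching at the ends of the scale is sketched below on simulated data. The simulated distribution and the bin width of 5 are arbitrary choices for illustration; with the real data only the last four commands would be needed.

```stata
* Simulated 0-100 outcome at four time points; values are invented for illustration.
clear
set seed 777
set obs 400
generate time = mod(_n, 4)                           // assumed coding: 0-3
generate test1 = min(100, max(0, rnormal(35, 15)))   // clamp draws to the 0-100 range

* Distribution of the outcome at each time point
histogram test1, by(time) width(5)

* How many observations sit exactly at the floor or the ceiling
count if test1 == 0
count if test1 == 100

* Percentiles and skewness as a numerical complement to the histogram
summarize test1, detail
```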

    • #3
      Dear Clyde Schechter,

      Thank you so much for your response to my forum post. I truly appreciate your input and attention to detail in clarifying the analysis approach for my clinical trial data. Your insights have been incredibly helpful.
      It worked perfectly. You are right that the sample size is not large enough for the stratification by severity.

      Really appreciate your response.
