  • Obtaining SD using margins (delta-SE) after mixed

    Is it possible to obtain means and standard deviations using -margins- after a -mixed- model that only includes a random y-intercept term to accommodate repeated observations within subject, over time? Assume no random slope terms here--just an intercept to adjust for the within-subjects design.

    For example:
    y = outcome (continuous/normal, collected at 2 or more time periods)
    group = indicator variable for group (example: 2 group design)
    time = indicator variable for time (example: 3 times)
    ID = variable containing subject IDs

    mixed y i.group##i.time,||ID:,vce(repeated) reml

    Assuming that the group#time term is significant... I would typically follow this by:

    margins group#time

    to get a table of estimated marginal means, delta-se, and 95% CI's, and I typically graph/publish data as mean(+/- 95% CI's)
    (I might also do some pairwise contrasts, but my question is not about that...)

    I rec'd an email from a researcher who is attempting to extract means and sd's from one of my pubs for the purpose of conducting a meta analysis, and I'm not up to speed on current thinking of converting delta-method SE's in a mixed model like this (random y-intercept only) to SDs. My initial instinct is to just tally up the n-per time per group (can differ if any missing data over time) and use the typical SD = SE*sqrt(n), but I wanted to verify with colleagues here on the appropriateness (or not) of that approach?

    (1) is SD = SE*sqrt(n) appropriate in this context?
    (2) is there a way to automate this using -margins- or some other command?


    Much obliged for your assistance.




  • #2
    Can you clarify what you are trying to calculate the SD of? -margins- calculates an estimate of the expected outcome in each group. The thing that is being estimated is a parameter of the regression model; it is not data and it does not, in frequentist statistics, have a probability distribution whose standard deviation can be calculated or estimated. What it does have is a sampling distribution: how would the estimates of it vary over repeated sampling? The standard deviation of that sampling distribution is known as the standard error.

    The formula you quote in your first question is the result of a theorem that says that if you have a simple random sample of data on a variable X, then the standard error (i.e., the standard deviation of the sampling distribution) of the estimated mean of X is SD/sqrt(n). But that formula doesn't apply in this context because you don't have a random sample of estimates (predictions) of the expected outcome. There is no such thing as the standard deviation of the expected value of X, because the expected value of X is a fixed (but unknown) constant, not a random variable, and it does not have a probability distribution if we are working within the domain of frequentist statistics. So you could calculate SE*sqrt(n), but there is no random variable for which that would be the standard deviation.
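
    To make the contrast concrete, here is a sketch of the two quantities in symbols. The notation is introduced here for illustration and is not from the original post: \sigma is the population SD of X, g is the function of the model coefficients that -margins- evaluates, and \widehat{V} is the estimated covariance matrix of the coefficient estimates.

    ```latex
    \[
    \operatorname{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}
    \qquad\text{versus}\qquad
    \widehat{\operatorname{SE}}\bigl(g(\hat{\beta})\bigr)
      = \sqrt{\nabla g(\hat{\beta})^{\top}\,\widehat{V}\,\nabla g(\hat{\beta})}
    \]
    ```

    The first expression involves the SD of observed data. The second (the delta-method SE) involves only the sampling variability of the coefficient estimates, so multiplying it by sqrt(n) does not, in general, recover the SD of any observed variable.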



    • #3
      Thank you Clyde. This makes perfect sense!

      If you'll indulge me a bit further... Indeed, marginal means are model-derived means, plugging in the values of the indicator variables per predictor, and working the model equation (using REML or FEML if mixed).

      But if the model that is being used only has factor variables (ex. group has two levels, time treated as indicator variable with 2 or more levels), and no continuously scaled covariates, and no missing data among observations, then margins = means of the data. This becomes most obviously apparent in non-mixed methods (ex. ANOVA), or perfectly balanced mixed models with no missing data. In those scenarios, it is possible and can be useful to use the means and standard deviations in order to calculate effect sizes--possibly for powering a future study, or (in this case, by another researcher) for extracting information from data or published study to contribute to a larger meta analysis.

      That is, in particular, what I'm trying to do here. I've published some results of an RCT, including means and confidence intervals in a table. Another researcher is asking for SD's instead of SE's because s/he wants to use our data in a larger meta analysis.

      Is this context more helpful?
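
      For the balanced, factors-only case described above, the observed cell means and SDs can be tabulated directly from the data, without going through the model. This is only a minimal sketch: it assumes the variables y, group, and time from the first post are in memory, and the names mean_y, sd_y, and n are made up for illustration.

      ```stata
      * Assumption: y, group, and time are the variables described in the first post
      preserve
      * One row per group#time cell: observed mean, SD, and non-missing n of y
      collapse (mean) mean_y=y (sd) sd_y=y (count) n=y, by(group time)
      list group time mean_y sd_y n, sepby(group) noobs
      restore
      ```

      These are descriptive statistics of the raw data, not model-based quantities, so they would match the -margins- means only in the balanced, no-missing-data situation described above.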



      • #4
        i've also looked for a way to express the between-subject variability when summarizing panel (repeated-over-time) data. the xtsum command will do this. if i'm trying to summarize differences based on a factor variable, i have used xtmixed or xtreg Y i.x to test the coefficient, then a series of xtsum Y if x==1, x==2, etc to get the subgroup means and sd's. i don't think this is highly appropriate but some people/reviewers want data with sd's.
        i hope this thread progresses to discuss a range of generalizable approaches. mine is not likely a good one.
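
        A rough sketch of the subgroup -xtsum- approach described above, written as a loop instead of a series of separate commands. It assumes the variable names from the first post (y, group, ID) and is offered only as an illustration of the idea, not as a recommendation.

        ```stata
        * Assumption: ID identifies subjects; y and group are as in the first post
        xtset ID
        levelsof group, local(groups)
        foreach g of local groups {
            display "Results for group `g'"
            * Reports overall, between-subject, and within-subject SDs of y
            xtsum y if group == `g'
        }
        ```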



        • #5
          So you can do something like this:

          Code:
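          * Predicted random intercept for each subject
          predict u, reffects

          * Keep copies of the original values so they can be restored afterwards
          clonevar group0 = group
          clonevar time0 = time

          levelsof group, local(groups)
          levelsof time, local(times)
          foreach g of local groups {
              foreach t of local times {
                  * Set every observation to this group#time combination
                  replace group = `g'
                  replace time = `t'
                  * Fixed-portion prediction plus the random intercept
                  predict yhat, xb
                  replace yhat = yhat + u
                  display "Results for group `g', time `t'"
                  summ yhat
                  drop yhat
              }
          }

          * Restore the original data
          replace group = group0
          replace time = time0
          drop group0 time0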
          Note: Not tested; beware of typos, unbalanced braces, etc.

          I have not done any meta-analysis work in well over a decade now, so I am not 100% sure that this is what is wanted. I'm inferring from the description of the problem that the desired results are the adjusted means and standard deviations of xb + u (fixed effects + random intercepts) for each group#time cell. That is what the above delivers. I offer no opinion as to whether these represent suitable ingredients for a meta-analysis.



          • #6
            Thank you Clyde! The `u' above is what I needed most...

            Much appreciated.
            Rob
