Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Margins from a mixed model with a log-transformed outcome

    I run a mixed model of a log-transformed outcome using ‘mixed’ (and the dfmethod(kroger) option with REML estimation because the sample size is only 36). I want to calculate predictive margins with confidence intervals. The ‘margins’ command produces estimates on the log scale so I cannot just back-transform them as E(Y|X) = EXP(XB). I figure I should instead estimate them as E(Y|X) = EXP(XB)*EXP(σ^2/2), where σ^2 is the variance of the errors.

    How can I obtain σ^2 and incorporate it in a margins command? It will probably be something like this:

    Code:
    margins, expression(exp(predict(xb))*(exp(???)))
    This is my model command, simplified:

    Code:
    mixed lnyvar i.xvar1 i.xvar2 || id:, reml dfmethod(kroger)
    Grateful for your help!

  • #2
    On retrieving the var(Residual) estimate after mixed:

    Code:
    webuse pig
    mixed weight week || _all: R.id || _all: R.week, reml dfmethod(kroger)
    di exp(`=e(b)[1, "lnsig_e:_cons"]')^2
    Res.:

    Code:
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    _all: Identity               |
                       var(R.id) |   15.15273   3.224447       9.98523    22.99448
    -----------------------------+------------------------------------------------
    _all: Identity               |
                     var(R.week) |   .1113996   .1075948      .0167785    .7396272
    -----------------------------+------------------------------------------------
                   var(Residual) |   4.296981   .3133894       3.72463    4.957282
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(2) = 476.10                Prob > chi2 = 0.0000
    
    Note: LR test is conservative and provided only for reference.
    
    .
    . di exp(`=e(b)[1, "lnsig_e:_cons"]')^2
    4.2969807

    Comment


    • #3
      Thanks Andrew! I doubt if I can just insert the value in a margins command. I think it would include it as though it were a known value measured without error. I'm pretty sure it needs to account for the standard error of the variance, but I don't know how to achieve that.

      Comment


      • #4
        I do not really follow what you are doing, so the following may not be relevant. If you run

        Code:
        margins, expression(exp(predict(xb))
        are you not just averaging the predictions over observations? For example:

        Code:
        webuse pig, clear
        mixed weight week || _all: R.id || _all: R.week, reml dfmethod(kroger)
        margins if id==2, expression(exp(predict(xb))) 
        margins if id==2, expression(exp(predict(xb))) by(week) post
        local id2obs 1.week
        forval i=2/9{
            local id2obs "`id2obs' + `i'.week"
        }
        lincom (`id2obs')/9
        Res.:

        Code:
        . margins if id==2, expression(exp(predict(xb))) 
        
        Predictive margins                              Number of obs     =          9
        
        Expression   : exp(predict(xb))
        
        ------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
               _cons |   5.31e+31   3.32e+31     1.60   0.110    -1.20e+31    1.18e+32
        ------------------------------------------------------------------------------
        
        . 
        . margins if id==2, expression(exp(predict(xb))) by(week) post
        
        Predictive margins                              Number of obs     =          9
        
        Expression   : exp(predict(xb))
        over         : week
        
        ------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                week |
                  1  |   1.27e+11   7.93e+10     1.60   0.110    -2.87e+10    2.82e+11
                  2  |   6.31e+13   3.83e+13     1.65   0.099    -1.19e+13    1.38e+14
                  3  |   3.14e+16   1.86e+16     1.69   0.092    -5.08e+15    6.79e+16
                  4  |   1.56e+19   9.13e+18     1.71   0.087    -2.27e+18    3.35e+19
                  5  |   7.77e+21   4.52e+21     1.72   0.085    -1.08e+21    1.66e+22
                  6  |   3.87e+24   2.26e+24     1.71   0.087    -5.61e+23    8.30e+24
                  7  |   1.93e+27   1.14e+27     1.69   0.092    -3.12e+26    4.16e+27
                  8  |   9.58e+29   5.81e+29     1.65   0.099    -1.81e+29    2.10e+30
                  9  |   4.77e+32   2.98e+32     1.60   0.110    -1.08e+32    1.06e+33
        ------------------------------------------------------------------------------
        
        . 
        . local id2obs 1.week
        
        . 
        . forval i=2/9{
          2. 
        .     local id2obs "`id2obs' + `i'.week"
          3. 
        . }
        
        . 
        . lincom (`id2obs')/9
        
         ( 1)  .1111111*1bn.week + .1111111*2.week + .1111111*3.week + .1111111*4.week + .1111111*5.week +
               .1111111*6.week + .1111111*7.week + .1111111*8.week + .1111111*9.week = 0
        
        ------------------------------------------------------------------------------
                     |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                 (1) |   5.31e+31   3.32e+31     1.60   0.110    -1.20e+31    1.18e+32
        ------------------------------------------------------------------------------

        Comment


        • #5
          Well, I guess the margins themselves should be possible to calculate correctly using the point estimate of the error variance, as you suggested in #2. They are group averages of the observations. But I think correct confidence intervals need to take the standard error of the point estimate of the variance into account as well. The error variance presented in the model output is also an estimate, with a random error described by its standard error. So I need the margins command to also take that error into account.

          Can someone else confirm my reasoning and suggest a solution?

          Comment


          • #6
            In StataNews 34-2, Chuck Huber made this argument (In the spotlight: Interpreting models for log-transformed outcomes). He suggested the following margins command after running gsem (lnwage is the outcome.
            Code:
            margins, expression(exp(predict(eta))*(exp((_b[/var(e.lnwage)])/2)))
            Huber:
            gsem also estimated the standard error of that variance, and margins will incorporate that standard error into its calculations
            My problem is that I run mixed instead of gsem and the code provided by Huber therefore doesn't work. I'm looking for the appropriate code to make margins integrate the standard error of the estimated error variance. I assume that the point estimates are correct without this correction, while the confidence intervals are incorrect .

            Grateful for any help.

            Comment


            • #7
              The residual variance's point estimate after -mixed- is obtained from the coefficient vector.
              Code:
              exp(_b[lnsig_e:_cons])^2
              I can't vouch for its correctness for your intended use, but the analogue of Chuck Huber's -margins- expression would be (after fitting the model with -mixed-)
              Code:
              tempname var
              scalar define `var' = exp(_b[lnsig_e:_cons])^2
              margins <whatever>, expression( exp(predict(xb)) * ( exp( (`var')/2) ) )
              Have you examined the residuals
              Code:
              predict double residuals, residuals
              pnorm residuals
              qnorm residuals
              after fitting the -mixed- model to the untransformed outcome variable values? And they're not acceptably normal-like?

              One hitch that I would think complicates the analysis of transformed hierarchical outcome data is that usually you assume that it's only the residuals that are skew, and the random effect of the cluster is still normally distributed. That is, the data-generating process is something like
              Code:
              observed value = <fixed effects> + random effect (assumed normal) + exp(normally distributed error)
              and so logarithmically transforming the observed values makes a mess of the random effect's estimates. If the quantile plots of the predicted residuals are unacceptable, then have you considered just leaving things in the log-transformed state, and interpreting, reporting and discussing from there?

              Comment


              • #8
                Thanks a lot, Joseph!

                The code you provided does the trick and I do get reasonable confidence intervals. But even more interesting is the rest of you message. When testing the same model with the outcome variable untransformed, I get roughly Normal first- and second-level residuals - at least as close to Normal as with the transformation. The margins actually look much the same, although there are some important differences. Nevertheless, this convinces me to construct a model with the outcome variable untransformed. Only that I will do it once again from start, accounting for possible confounding by extraneous factors. Great, this really helped me!

                Comment

                Working...
                X