Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Different results between mediate Stata 18 and paramed

    Dear statalists,

    I tried to use mediate command in Stata 18, and found that the results are different between mediate and paramed when I used a binary outcome and a binary mediator.
    When I did the same analysis by hand with logit and postestimations, the results are same as paramed.
    Why are there such differences? I guess mediate uses different aproach, but I have no idea.

    There is an example:

    Code:
    use https://www.stata-press.com/data/r18/wellbeing, clear
    *mediate Stata 18
    mediate (bwellbeing age, logit) (bbonotonin age, logit) (exercise), pomeans nde nie te
    estat or
    
    *paramed
    paramed bwellbeing, avar(exercise) mvar(bbonotonin) a0(0) a1(1) m(0) yreg(logistic) mreg(logistic) cvars(age)
    
    // results are different


    Calculate by hand, following to the formula in Explanation in Tyler J. Vanderweele. Causal Inference: Methods for Mediation and Interaction, p29

    Code:
    *fit outcome model
    logit bwellbeing i.exercise##c.bbonotonin age
    sca theta1 = _b[1.exercise]
    sca theta2 = _b[bbonotonin]
    sca theta3 = _b[1.exercise#c.bbonotonin]
    sca theta4 = _b[age]
    
    *fit mediator model
    logit bbonotonin i.exercise age
    sca beta0 = _b[_cons]
    sca beta1 = _b[1.exercise]
    sca beta2 = _b[age]
    
    *define other values
    sca aint = 1
    sca astar = 0
    qui sum age
    sca cmean = `r(mean)'
    
    di "NDE_OR = " = (exp(theta1*aint) *(1 + exp(theta2 + theta3*aint  + beta0 + beta1*astar + beta2*cmean))) / ///
                     (exp(theta1*astar)*(1 + exp(theta2 + theta3*astar + beta0 + beta1*astar + beta2*cmean)))
    
    di "NIE_OR = " = ((1 + exp(beta0 + beta1*astar + beta2*cmean))*(1 + exp(theta2 + theta3*aint + beta0 + beta1*aint  + beta2*cmean))) / ///
                     ((1 + exp(beta0 + beta1*aint  + beta2*cmean))*(1 + exp(theta2 + theta3*aint + beta0 + beta1*astar + beta2*cmean)))
    
    // results are the same as paramed
    Last edited by Yusuke Matsuyama; 26 Jun 2023, 15:13.

  • #2
    Here are the code with main outputs:

    Code:
    . use https://www.stata-press.com/data/r18/wellbeing, clear
    (Fictional well-being data)
    
    . *mediate Stata 18
    . qui mediate (bwellbeing age, logit) (bbonotonin age, logit) (exercise), pomeans nde nie te
    . estat or
    
    Transformed treatment effects                            Number of obs = 2,000
    ----------------------------------------------------------------------------------------
                           |               Robust
                bwellbeing | Odds ratio   std. err.      z    P>|z|     [95% conf. interval]
    -----------------------+----------------------------------------------------------------
    NIE                    |
                  exercise |
    (Exercise vs Control)  |   1.516983   .1762198     3.59   0.000     1.208095    1.904847
    -----------------------+----------------------------------------------------------------
    NDE                    |
                  exercise |
    (Exercise vs Control)  |   1.979511   .2943225     4.59   0.000       1.4791    2.649222
    -----------------------+----------------------------------------------------------------
    TE                     |
                  exercise |
    (Exercise vs Control)  |   3.002884   .2809796    11.75   0.000     2.499722    3.607326
    ----------------------------------------------------------------------------------------
    
    . *paramed
    . paramed bwellbeing, avar(exercise) mvar(bbonotonin) a0(0) a1(1) m(0) yreg(logistic) mreg(logistic) cvars(age)
    
    Iteration 0:  Log likelihood = -1370.3761
    Iteration 1:  Log likelihood = -1284.0941
    Iteration 2:  Log likelihood =  -1283.828
    Iteration 3:  Log likelihood =  -1283.828
    
    Logistic regression                               Number of obs   =       2000
                                                      LR chi2(4)      =     173.10
                                                      Prob > chi2     =     0.0000
    Log likelihood =  -1283.828                       Pseudo R2       =     0.0632
    
    ------------------------------------------------------------------------------
      bwellbeing | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        exercise |   .6700708   .1676859     4.00   0.000     .3414125     .998729
      bbonotonin |   .3877229   .2145402     1.81   0.071    -.0327682    .8082141
    _exercise_~n |   .2025021    .270316     0.75   0.454    -.3273076    .7323117
             age |   .0211115   .0053646     3.94   0.000      .010597     .031626
           _cons |  -1.865164    .263223    -7.09   0.000    -2.381071   -1.349256
    ------------------------------------------------------------------------------
    
    Iteration 0:  Log likelihood = -1383.6922
    Iteration 1:  Log likelihood = -830.54667
    Iteration 2:  Log likelihood = -807.92863
    Iteration 3:  Log likelihood = -807.28109
    Iteration 4:  Log likelihood =    -807.28
    
    Logistic regression                               Number of obs   =       2000
                                                      LR chi2(2)      =    1152.82
                                                      Prob > chi2     =     0.0000
    Log likelihood =    -807.28                       Pseudo R2       =     0.4166
    
    ------------------------------------------------------------------------------
      bbonotonin | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
        exercise |   3.712399   .1358174    27.33   0.000     3.446202    3.978597
             age |   .0324635   .0072884     4.45   0.000     .0181785    .0467485
           _cons |  -3.654159   .3683806    -9.92   0.000    -4.376172   -2.932146
    ------------------------------------------------------------------------------
    
                 |   Estimate     Std Err   P>|z|   [95% Conf   Interval]
    -------------+-------------------------------------------------------
             cde |  1.9543756   .16768588   0.000   1.4069249   2.7148456
             nde |  2.0183921   .14345349   0.000    1.523686   2.6737181
             nie |   1.536077   .11972601   0.000   1.2147871   1.9423426
             mte |  3.1004058   .09634844   0.000   2.5668772    3.744829
    
    cde:controlled direct effect, nde:natural direct effect, nie:natural indirect effect, mte:marginal total effect
    
    . // results are different
    
    . *fit outcome model
    . qui logit bwellbeing i.exercise##c.bbonotonin age
    . sca theta1 = _b[1.exercise]
    . sca theta2 = _b[bbonotonin]
    . sca theta3 = _b[1.exercise#c.bbonotonin]
    . sca theta4 = _b[age]
     
    . *fit mediator model
    . qui logit bbonotonin i.exercise age
    . sca beta0 = _b[_cons]
    . sca beta1 = _b[1.exercise]
    . sca beta2 = _b[age]
     
    . *define other values
    . sca aint = 1
    . sca astar = 0
    . qui sum age
    . sca cmean = `r(mean)'
    
    . di "NDE_OR = " = (exp(theta1*aint) *(1 + exp(theta2 + theta3*aint  + beta0 + beta1*astar + beta2*cmean))) / ///
    >                  (exp(theta1*astar)*(1 + exp(theta2 + theta3*astar + beta0 + beta1*astar + beta2*cmean)))
    NDE_OR = 2.018392
    
    . di "NIE_OR = " = ((1 + exp(beta0 + beta1*astar + beta2*cmean))*(1 + exp(theta2 + theta3*aint + beta0 + beta1*aint  + beta2*cmean))) / ///
    >                  ((1 + exp(beta0 + beta1*aint  + beta2*cmean))*(1 + exp(theta2 + theta3*aint + beta0 + beta1*astar + beta2*cmean)))
    NIE_OR = 1.536077
    
    . // results are the same as paramed

    Comment


    • #3
      mediate indeed uses a different approach. I don't have the Vanderweele reference handy right now, but if I remember correctly, their formulas for logit outcome models are derived using an approximation that only holds if the outcome is rare. In contrast, mediate with logit outcome model computes effects based on a derivation from the logistic density which holds generally, not just for rare outcomes. Another difference is that mediate calculates population average effects/odds ratios (by averaging over the covariate distributions), whereas the odds ratios from paramed are conditional on covariate means. For details about the mediate implementation, please have a look at the methods and formulas section of the mediate manual:
      https://www.stata.com/manuals/causalmediate.pdf

      I hope this helps,
      Joerg

      Comment


      • #4
        Dear Joerg,

        Thank you for your reply. I checked the mediate manual and tried to get the point estimates by hand for my understanding, but I was not able to get them.
        I think there are mistakes in my code. Could you please let me know how to calculate the same point estimates as mediate? Covariates were removed to make the model simple.


        Code:
        . use https://www.stata-press.com/data/r18/wellbeing, clear
        (Fictional well-being data)
        
        . mediate (bwellbeing , logit) (bbonotonin , logit) (exercise) , pomeans
        
        Iteration 0:  EE criterion =  2.047e-17  
        Iteration 1:  EE criterion =  3.476e-32  
        
        Causal mediation analysis                                Number of obs = 2,000
        
        Outcome model:     Logit
        Mediator model:    Logit
        Mediator variable: bbonotonin
        Treatment type:    Binary
        ---------------------------------------------------------------------------------------
                              |               Robust
                   bwellbeing | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
        ----------------------+----------------------------------------------------------------
        POmeans               |
                         Y0M0 |   .3038105    .014759    20.58   0.000     .2748835    .3327375
                         Y1M0 |   .4496044   .0325748    13.80   0.000      .385759    .5134498
                         Y0M1 |   .3730057   .0391666     9.52   0.000     .2962406    .4497708
                         Y1M1 |   .5626822    .015464    36.39   0.000     .5323733    .5929911
        ---------------------------------------------------------------------------------------
        Note: Outcome equation includes treatment–mediator interaction.
        
        . *ymodel
        . qui logit bwellbeing i.exercise##c.bbonotonin
        
        . sca beta0 = _b[_cons]
        . sca beta1 = _b[1.exercise]
        . sca beta2 = _b[bbonotonin]
        . sca beta3 = _b[1.exercise#c.bbonotonin]
         
        . *mmodel 
        . qui logit bbonotonin i.exercise
        
        . sca alpha0 = _b[_cons]
        . sca alpha1 = _b[1.exercise]
        . sca ay = 0
        . sca am = 0
        . sca v = beta0 + beta1*ay 
        . sca e = alpha0 + alpha1*am 
        . sca k = beta2 + beta3*ay
        
        . di "POmean Y0M0 = " logistic(v + k)*logistic(e) + logistic(v)*(1-logistic(e))
        POmean Y0M0 = .34188937
        
        // this is different from POmean from mediate: Y0M0 | .3038105

        Comment


        • #5
          Sorry to post two times, I solved it myself. The scalars v, e, k should be named as vt, et, kt, otherwise "e" refferes "exercise" afterwards. Then, the output is the same as the point estimate from mediate.

          Comment

          Working...
          X