Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Mixed Effects Binomial Regression

    Hello,

    I am am trying to find the right code to do a mixed effect binomial regression to give the relative risk of an outcome, but I am struggling to find the right mixed effect model.
    I essentially want binreg picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num c.log_pim2 i.epoch, rr with a random intercept by patientid. This is because my outcome is death and the same patient is captured multiple times in the dataset with only the final admission having the potential to result in death so I want to account for this.

    I unsuccessfully tried the following work arounds
    gllamm picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num i.epoch, rr i(patientid) fam(binomial) link(log)
    factor-variable and time-series operators not allowed

    xi: gllamm picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num i.epoch log_pim2, i(patientid) fam(binomial) link(log)
    i.cpi_poprank~e _Icpi_popra_1-5 (naturally coded; _Icpi_popra_1 omitted)
    i.ethnic_cat _Iethnic_ca_1-5 (naturally coded; _Iethnic_ca_1 omitted)
    i.age_cat _Iage_cat_1-5 (naturally coded; _Iage_cat_1 omitted)
    i.sex _Isex_1-2 (naturally coded; _Isex_1 omitted)
    i.season _Iseason_1-2 (naturally coded; _Iseason_1 omitted)
    i.primarydiag~m _Iprimarydi_1-6 (naturally coded; _Iprimarydi_1 omitted)
    i.epoch _Iepoch_1-3 (naturally coded; _Iepoch_1 omitted)
    estimates diverging

    From doing some reading my understanding is that the following code mixed effects generalised linear model with poisson family gives an approximation of relative risk
    meglm picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num c.log_pim2 i.epoch || patientid:, family(poisson) link(log) eform
    However this does not give RR as an output but IRR

    I would be really grateful for any advice on whether this is the best approach or if there are any other alternative methods for making a mixed effects binomial regression model.

    Many thanks

    Hannah

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(picu_mort cpi_poprank_quintile ethnic_cat age_cat sex season) long primarydiag_num float(log_pim2 epoch) long patientid
    0 2 1 3 2 1 2 -3.5771534 1 11
    0 2 1 3 2 1 2 -1.1086035 1 11
    0 . 4 5 1 2 4 -2.7487965 2 18
    0 . 4 5 1 2 4  -5.116813 2 18
    0 . 4 5 1 1 6 -1.9823108 2 18
    1 . 4 5 1 1 4  -3.421944 2 18
    0 5 1 2 1 1 6  -4.244526 1 21
    0 2 1 2 2 1 1 -3.6874056 1 31
    0 . 1 5 2 1 6 -3.5771534 1 42
    0 2 1 2 1 1 1  -3.039826 1 52
    end
    label values picu_mort picu_mort
    label def picu_mort 0 "alive", modify
    label def picu_mort 1 "dead", modify
    label values ethnic_cat ethnic_cat
    label def ethnic_cat 1 "Any White", modify
    label def ethnic_cat 4 "Any Black", modify
    label values age_cat age_cat
    label def age_cat 2 "29 days to 1 year", modify
    label def age_cat 3 "1-4 years", modify
    label def age_cat 5 "11-15 years", modify
    label values sex sex
    label def sex 1 "Male", modify
    label def sex 2 "Female", modify
    label values season season
    label def season 1 "Winter", modify
    label def season 2 "Summer", modify
    label values primarydiag_num primarydiag_num
    label def primarydiag_num 1 "Cardiovascular", modify
    label def primarydiag_num 2 "Gastrointestinal", modify
    label def primarydiag_num 4 "Neurological", modify
    label def primarydiag_num 6 "Respiratory", modify

  • #2
    Hello - I would be so grateful if anyone was able to give me any thoughts or advice on this question!

    Comment


    • #3
      I'm not sure I fully understand your objective but have you considered beta-binomial (beta-logistic) models?
      Code:
      help betabin

      Comment


      • #4
        I imagine that part of the challenge you are having with gllamm is that you have a complex model. You may consider slowly building up to see if there are particular predictors that are causing problems in combination.

        In terms of alternatives to gllamm, a helpful thread on cross-validated suggests running a mixed effect logistic model (melogit in Stata) and then use margins to get the estimates of interest as risk ratios. The user-written Stata program, adjrr, by Norton et al. (2013), which calculates risk ratios after logit, probit, ordinal, and multinomial models unfortunately does not work with mixed effects models. However, looking at their ado file, the following code should get you close to what you want (note that margins integrates the random effects by default):
        Code:
        webuse bangladesh, clear
        
        ** Model
        melogit c_use i.urban age child* || district:
        eststo melog1 
        
        ** Categorical predictor
        margins urban, post 
        *Adjusted Risk Ration (ARR)
        nlcom _b[1.urban]/_b[0.urban]
        *Adjusted Risk Difference (ARD)
        nlcom _b[1.urban] - _b[0.urban]
        
        ** Continuous predictor, must use at() option in margins
        margins, at(age = (-7.56 6.44)) post 
        *Adjusted Risk Ration (ARR)
        nlcom _b[2._at]/_b[1._at]
        *Adjusted Risk Difference (ARD)
        nlcom _b[2._at] - _b[1._at]

        Comment


        • #5
          Thank you so much! This seems to work.

          My only follow on question is how do you get p values for each individual effect estimate? They seem to all be p <0.001
          This is the code I used

          melogit picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num i.epoch || patientid:
          eststo melog1
          ** Categorical predictor
          margins cpi_poprank_quintile, post
          *Adjusted Risk Ration (ARR)
          nlcom _b[2.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          nlcom _b[3.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          nlcom _b[4.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          nlcom _b[5.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]


          and output.
          . margins cpi_poprank_quintile, post
          Predictive margins Number of obs = 183,001
          Model VCE: OIM
          Expression: Marginal predicted mean, predict()
          Delta-method
          Margin std. err. z P>z [95% conf. interval]
          cpi_poprank_quintile
          1 .0335608 .0011022 30.45 0.000 .0314005 .0357212
          2 .0337029 .001013 33.27 0.000 .0317175 .0356883
          3 .035219 .000939 37.51 0.000 .0333787 .0370594
          4 .0353053 .0008471 41.68 0.000 .033645 .0369657
          5 .0370574 .0007785 47.60 0.000 .0355315 .0385832
          . *Adjusted Risk Ration (ARR)
          . nlcom _b[2.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          _nl_1: _b[2.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          Coefficient Std. err. z P>z [95% conf. interval]
          _nl_1 1.004234 .0443772 22.63 0.000 .9172562 1.091212
          . nlcom _b[3.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          _nl_1: _b[3.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          Coefficient Std. err. z P>z [95% conf. interval]
          _nl_1 1.049409 .0442355 23.72 0.000 .9627092 1.136109
          . nlcom _b[4.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          _nl_1: _b[4.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          Coefficient Std. err. z P>z [95% conf. interval]
          _nl_1 1.05198 .042838 24.56 0.000 .9680195 1.135941
          . nlcom _b[5.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          _nl_1: _b[5.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
          Coefficient Std. err. z P>z [95% conf. interval]
          _nl_1 1.104185 .0435616 25.35 0.000 1.018806 1.189565

          Comment


          • #6
            Looking at the adjrr ado file, they report the ARR coefficient estimate, standard error, and p-value from margins. However, for the 95% CIs, they do an additional calculation based on the log of the ARR. The ARD coefficient, standard error, p-vlaue, and 95% CIs come straight from the margins call. See below:
            Code:
            webuse bangladesh, clear
            ** Model
            melogit c_use i.urban age child* || district:
            eststo melog1 
            
            ** Categorical predictor
            margins urban, post 
            *Adjusted Risk Ratio (ARR) - use margins coeff and se 
            nlcom _b[1.urban]/_b[0.urban]
            *Adj Risk Ratio (ARR) - 95% CIs, use lnARR
            nlcom (lnARR: ln(_b[1.urban] / _b[0.urban])), post
            di "lower 95% CI = " exp(_b[lnARR] - invnorm(0.975)*_se[lnARR])
            di "upper 95% CI = " exp(_b[lnARR] + invnorm(0.975)*_se[lnARR])
            
            *Adjusted Risk Difference (ARD) - use margins coeff, SE, and 95% CIs
            est rest melog1
            margins urban, post
            nlcom _b[1.urban] - _b[0.urban]
            
            ** Continuous predictor, must use at() option in margins
            est rest melog1
            margins, at(age = (-7.56 6.44)) post 
            *Adjusted Risk Ration (ARR)
            nlcom _b[2._at]/_b[1._at]
            nlcom (lnARR: ln(_b[2._at] / _b[1._at])), post
            di "lower 95% CI =" exp(_b[lnARR] - invnorm(0.975)*_se[lnARR])
            di "upper 95% CI =" exp(_b[lnARR] + invnorm(0.975)*_se[lnARR])
            
            *Adjusted Risk Difference (ARD)
            est rest melog1
            margins, at(age = (-7.56 6.44)) post
            nlcom _b[2._at] - _b[1._at]

            Comment


            • #7
              Originally posted by hannah mitchell View Post
              I am am trying to find the right code to do a mixed effect binomial regression to give the relative risk of an outcome . . .my outcome is death and the same patient is captured multiple times in the dataset with only the final admission having the potential to result in death so I want to account for this.
              Do all of the patients have the same specified observation interval? (That is, patient outcome was recorded—either alive or dead—as of a fixed elapsed time after the index event.)

              If not, then perhaps a time-to-event (survival) model would be more suitable than a binomial model with a log link function.

              There is a connection between survival and binomial regression models (at least with logistic or complementary log-log link functions—see here for example), but unless you have a uniform follow-up interval for all patients, then it might be worth considering something beyond a relative-risk regression model.

              Comment

              Working...
              X