Mixed Effects Binomial Regression

hannah mitchell

Join Date: Mar 2018

Posts: 18
#1

Mixed Effects Binomial Regression

13 Aug 2024, 21:10

Hello,

I am am trying to find the right code to do a mixed effect binomial regression to give the relative risk of an outcome, but I am struggling to find the right mixed effect model.
I essentially want binreg picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num c.log_pim2 i.epoch, rr with a random intercept by patientid. This is because my outcome is death and the same patient is captured multiple times in the dataset with only the final admission having the potential to result in death so I want to account for this.

I unsuccessfully tried the following work arounds
gllamm picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num i.epoch, rr i(patientid) fam(binomial) link(log)
factor-variable and time-series operators not allowed

xi: gllamm picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num i.epoch log_pim2, i(patientid) fam(binomial) link(log)
i.cpi_poprank~e _Icpi_popra_1-5 (naturally coded; _Icpi_popra_1 omitted)
i.ethnic_cat _Iethnic_ca_1-5 (naturally coded; _Iethnic_ca_1 omitted)
i.age_cat _Iage_cat_1-5 (naturally coded; _Iage_cat_1 omitted)
i.sex _Isex_1-2 (naturally coded; _Isex_1 omitted)
i.season _Iseason_1-2 (naturally coded; _Iseason_1 omitted)
i.primarydiag~m _Iprimarydi_1-6 (naturally coded; _Iprimarydi_1 omitted)
i.epoch _Iepoch_1-3 (naturally coded; _Iepoch_1 omitted)
estimates diverging

From doing some reading my understanding is that the following code mixed effects generalised linear model with poisson family gives an approximation of relative risk
meglm picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num c.log_pim2 i.epoch || patientid:, family(poisson) link(log) eform
However this does not give RR as an output but IRR

I would be really grateful for any advice on whether this is the best approach or if there are any other alternative methods for making a mixed effects binomial regression model.

Many thanks

Hannah

Code:

* Example generated by -dataex-. For more info, type help dataex clear input float(picu_mort cpi_poprank_quintile ethnic_cat age_cat sex season) long primarydiag_num float(log_pim2 epoch) long patientid 0 2 1 3 2 1 2 -3.5771534 1 11 0 2 1 3 2 1 2 -1.1086035 1 11 0 . 4 5 1 2 4 -2.7487965 2 18 0 . 4 5 1 2 4 -5.116813 2 18 0 . 4 5 1 1 6 -1.9823108 2 18 1 . 4 5 1 1 4 -3.421944 2 18 0 5 1 2 1 1 6 -4.244526 1 21 0 2 1 2 2 1 1 -3.6874056 1 31 0 . 1 5 2 1 6 -3.5771534 1 42 0 2 1 2 1 1 1 -3.039826 1 52 end label values picu_mort picu_mort label def picu_mort 0 "alive", modify label def picu_mort 1 "dead", modify label values ethnic_cat ethnic_cat label def ethnic_cat 1 "Any White", modify label def ethnic_cat 4 "Any Black", modify label values age_cat age_cat label def age_cat 2 "29 days to 1 year", modify label def age_cat 3 "1-4 years", modify label def age_cat 5 "11-15 years", modify label values sex sex label def sex 1 "Male", modify label def sex 2 "Female", modify label values season season label def season 1 "Winter", modify label def season 2 "Summer", modify label values primarydiag_num primarydiag_num label def primarydiag_num 1 "Cardiovascular", modify label def primarydiag_num 2 "Gastrointestinal", modify label def primarydiag_num 4 "Neurological", modify label def primarydiag_num 6 "Respiratory", modify
Tags: None
hannah mitchell

Join Date: Mar 2018

Posts: 18
#2

19 Aug 2024, 06:52

Hello - I would be so grateful if anyone was able to give me any thoughts or advice on this question!
Comment
John Mullahy

Join Date: Dec 2016

Posts: 743
#3

20 Aug 2024, 07:23

I'm not sure I fully understand your objective but have you considered beta-binomial (beta-logistic) models?

Code:

help betabin
Comment
Erik Ruzek

Join Date: Oct 2017

Posts: 419
#4

20 Aug 2024, 13:56

I imagine that part of the challenge you are having with gllamm is that you have a complex model. You may consider slowly building up to see if there are particular predictors that are causing problems in combination.

In terms of alternatives to gllamm, a helpful thread on cross-validated suggests running a mixed effect logistic model (melogit in Stata) and then use margins to get the estimates of interest as risk ratios. The user-written Stata program, adjrr, by Norton et al. (2013), which calculates risk ratios after logit, probit, ordinal, and multinomial models unfortunately does not work with mixed effects models. However, looking at their ado file, the following code should get you close to what you want (note that margins integrates the random effects by default):

Code:

webuse bangladesh, clear ** Model melogit c_use i.urban age child* || district: eststo melog1 ** Categorical predictor margins urban, post *Adjusted Risk Ration (ARR) nlcom _b[1.urban]/_b[0.urban] *Adjusted Risk Difference (ARD) nlcom _b[1.urban] - _b[0.urban] ** Continuous predictor, must use at() option in margins margins, at(age = (-7.56 6.44)) post *Adjusted Risk Ration (ARR) nlcom _b[2._at]/_b[1._at] *Adjusted Risk Difference (ARD) nlcom _b[2._at] - _b[1._at]
Comment

hannah mitchell

Join Date: Mar 2018
Posts: 18

31 Aug 2024, 12:05

Thank you so much! This seems to work.

My only follow on question is how do you get p values for each individual effect estimate? They seem to all be p <0.001
This is the code I used

melogit picu_mort i.cpi_poprank_quintile i.ethnic_cat i.age_cat i.sex i.season i.primarydiag_num i.epoch || patientid:
eststo melog1
** Categorical predictor
margins cpi_poprank_quintile, post
*Adjusted Risk Ration (ARR)
nlcom _b[2.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
nlcom _b[3.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
nlcom _b[4.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]
nlcom _b[5.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

and output.

. margins cpi_poprank_quintile, post

Predictive margins Number of obs = 183,001

Model VCE: OIM

Expression: Marginal predicted mean, predict()

Delta-method

Margin std. err. z P>z [95% conf. interval]

cpi_poprank_quintile

1 .0335608 .0011022 30.45 0.000 .0314005 .0357212

2 .0337029 .001013 33.27 0.000 .0317175 .0356883

3 .035219 .000939 37.51 0.000 .0333787 .0370594

4 .0353053 .0008471 41.68 0.000 .033645 .0369657

5 .0370574 .0007785 47.60 0.000 .0355315 .0385832

. *Adjusted Risk Ration (ARR)

. nlcom _b[2.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

_nl_1: _b[2.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

Coefficient Std. err. z P>z [95% conf. interval]

_nl_1 1.004234 .0443772 22.63 0.000 .9172562 1.091212

. nlcom _b[3.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

_nl_1: _b[3.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

Coefficient Std. err. z P>z [95% conf. interval]

_nl_1 1.049409 .0442355 23.72 0.000 .9627092 1.136109

. nlcom _b[4.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

_nl_1: _b[4.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

Coefficient Std. err. z P>z [95% conf. interval]

_nl_1 1.05198 .042838 24.56 0.000 .9680195 1.135941

. nlcom _b[5.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

_nl_1: _b[5.cpi_poprank_quintile]/_b[1.cpi_poprank_quintile]

Coefficient Std. err. z P>z [95% conf. interval]

_nl_1 1.104185 .0435616 25.35 0.000 1.018806 1.189565

Comment

Erik Ruzek

Join Date: Oct 2017
Posts: 419

31 Aug 2024, 12:59

Looking at the adjrr ado file, they report the ARR coefficient estimate, standard error, and p-value from margins. However, for the 95% CIs, they do an additional calculation based on the log of the ARR. The ARD coefficient, standard error, p-vlaue, and 95% CIs come straight from the margins call. See below:

Code:

webuse bangladesh, clear
** Model
melogit c_use i.urban age child* || district:
eststo melog1 

** Categorical predictor
margins urban, post 
*Adjusted Risk Ratio (ARR) - use margins coeff and se 
nlcom _b[1.urban]/_b[0.urban]
*Adj Risk Ratio (ARR) - 95% CIs, use lnARR
nlcom (lnARR: ln(_b[1.urban] / _b[0.urban])), post
di "lower 95% CI = " exp(_b[lnARR] - invnorm(0.975)*_se[lnARR])
di "upper 95% CI = " exp(_b[lnARR] + invnorm(0.975)*_se[lnARR])

*Adjusted Risk Difference (ARD) - use margins coeff, SE, and 95% CIs
est rest melog1
margins urban, post
nlcom _b[1.urban] - _b[0.urban]

** Continuous predictor, must use at() option in margins
est rest melog1
margins, at(age = (-7.56 6.44)) post 
*Adjusted Risk Ration (ARR)
nlcom _b[2._at]/_b[1._at]
nlcom (lnARR: ln(_b[2._at] / _b[1._at])), post
di "lower 95% CI =" exp(_b[lnARR] - invnorm(0.975)*_se[lnARR])
di "upper 95% CI =" exp(_b[lnARR] + invnorm(0.975)*_se[lnARR])

*Adjusted Risk Difference (ARD)
est rest melog1
margins, at(age = (-7.56 6.44)) post
nlcom _b[2._at] - _b[1._at]

Comment

Joseph Coveney

Join Date: Apr 2014

Posts: 4377
#7

01 Sep 2024, 00:27

Originally posted by hannah mitchell View Post

I am am trying to find the right code to do a mixed effect binomial regression to give the relative risk of an outcome . . .my outcome is death and the same patient is captured multiple times in the dataset with only the final admission having the potential to result in death so I want to account for this.

Do all of the patients have the same specified observation interval? (That is, patient outcome was recorded—either alive or dead—as of a fixed elapsed time after the index event.)

If not, then perhaps a time-to-event (survival) model would be more suitable than a binomial model with a log link function.

There is a connection between survival and binomial regression models (at least with logistic or complementary log-log link functions—see here for example), but unless you have a uniform follow-up interval for all patients, then it might be worth considering something beyond a relative-risk regression model.
Comment

Announcement