I am analyzing summarized Medicaid data (N = 1,102,479) and using Stata 17.0. I am looking at various co-occurring mental health and substance use disorder variables coded as binary in relation to a 5-category variable that captures increasing levels of opioid use and misuse. I have been running separate mlogit models for each disorder. Because I want to control for demographic covariates, I am including in the model a small set of demographic predictors (e.g., race/ethnicity, gender, age, etc.).
Running the models on the raw data and requesting relative risk ratios produces meaningful coefficients for each predictor (i.e., all values greater than 0). Here is a snippet of the output:
Because two of the covariates, race/ethnicity and marital status, have a fair amount of missing data (missing race = 38,880 and missing marital status = 137,727), I decided to run a sensitivity analysis: use multiple imputation to impute values for these variables, re-run the same mlogit models, and compare the results. Here is the setup code for the multiple imputation and the following mlogit model:
Running this code, however, produces output with negative coefficients for some of the predictors, which I don't believe is possible for risk ratios. In fact, whether I run with or without the rrr option specified, I seem to get the same results. It is as if the request for risk ratios is being ignored. Here is a snippet of the output for a model where I included fewer predictors but requested risk ratios just to test (I get the same problem with the full or reduced set):
Question: Am I setting this up correctly, and if so, how can there be negative coefficients? Thanks for any help.
Best,
James Swartz
Code:
*** Run multinomial Logistic Regression by Co-occurring Disorder ***
foreach var of varlist alchl_flag tbcco_flag cnnbs_flag hllgn_flag inhlnts_flag ///
        sha_flag stmlnts_ccne_flag stmlnts_othr_flag mh_dsrdr_anxty mh_dsrdr_adhd mh_dsrdr_atsm ///
        mh_dsrdr_bplr mh_dsrdr_dprssn mh_dsrdr_intllctl mh_dsrdr_prsnlty mh_dsrdr_ptsd mh_dsrdr_schz {
    mlogit cat_misuse5 i.(`var' male race_ethnic mrtl_stus_cd vet_ind ctznshp_ind) age, rrr nolog
}
HTML Code:
No_analgesic_prescription   (base outcome)

Prescription___no_misuse
stmlnts_ccne_flag
  Present                    1.272366   .0669816     4.58   0.000    1.14763   1.410659
1.male                       .7743273   .0063945   -30.97   0.000   .7618953   .7869622
race_ethnic
  NH-Black                   .8001621   .0072233   -24.70   0.000   .7861292   .8144455
  Hispanic                   .6107236   .0082195   -36.64   0.000   .5948242   .6270479
  Other                      .4590468   .0119958   -29.80   0.000   .4361274   .4831706
mrtl_stus_cd
  Legally separated          1.229824   .0496552     5.12   0.000   1.136253   1.331101
  Divorced                   1.128576   .0167569     8.15   0.000   1.096207   1.161902
  Separated                  1.282765   .0307162    10.40   0.000   1.223953   1.344402
  Widower                     .985869   .0328658    -0.43   0.669   .9235126   1.052436
  Never married/partnered    .9272318    .010228    -6.85   0.000   .9074004   .9474967
Code:
*** Rerun using multiple imputation to impute race/ethnicity and marital status ***
mi set flong
mi register imputed race_ethnic mrtl_stus
mi register regular cat_misuse5 age alchl_flag tbcco_flag cnnbs_flag ///
mh_dsrdr_bplr stmlnts_ccne_flag mh_dsrdr_dprssn mh_dsrdr_anxty ///
mh_dsrdr_schz mh_dsrdr_ptsd
mi impute chained (mlogit) race_ethnic mrtl_stus = I.(cat_misuse5 alchl_flag tbcco_flag cnnbs_flag ///
stmlnts_ccne_flag mh_dsrdr_bplr mh_dsrdr_dprssn mh_dsrdr_anxty ///
mh_dsrdr_schz mh_dsrdr_ptsd) c.age, add(5)
mi estimate: mlogit cat_misuse5 i.(alchl_flag male race_ethnic mrtl_stus vet_ind ctznshp_ind) age, rrr nolog
Code:
. mi estimate : mlogit cat_misuse5 ib(first).alchl_flag age ib(first).male, rrr
HTML Code:
cat_misuse5                 Coefficient   Std. err.         t     P>t    [95% conf. interval]

No_analgesic_prescription   (base outcome)

Prescription___no_misuse
alchl_flag
  Present                      .4387507     .023464     18.70   0.000    .3927622    .4847393
age                            .0127149     .000277     45.90   0.000     .012172    .0132578
1.male                        -.2652595    .0076297    -34.77   0.000   -.2802135   -.2503055
_cons                         -2.856731    .0118297   -241.49   0.000   -2.879916   -2.833545
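A possible sanity check on the output above, offered as a minimal sketch rather than a confirmed fix. It assumes the table reports log relative-risk ratios, in which case exponentiating a coefficient should give a positive ratio (roughly 1.55 for alchl_flag and 0.77 for 1.male). It also assumes that mi estimate takes the exponentiated-form option on the prefix, before the colon, rather than on the mlogit command after it.

Code:
* Assumption: the coefficients above are on the log scale, so exp() gives the ratio
display exp(.4387507)
display exp(-.2652595)
* Assumption: rrr is given to the mi estimate prefix, not to mlogit after the colon
mi estimate, rrr: mlogit cat_misuse5 ib(first).alchl_flag age ib(first).male

If the second assumption holds, the rerun table should show ratios that are all greater than 0 and directly comparable to the raw-data output.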
