Advice on interpreting categorical X categorical interactions in logistic regression model (and using margins)

Amal Khanolkar

Join Date: Feb 2015
Posts: 146

26 Aug 2022, 02:26

Hi All

I would like some advice on interpreting categorical X categorical interactions in a logistic regression model that also includes adjustment for other categorical predictors. I've tried to make sense of the interactions by estimating predicted probabilities using margins.

The logistic regression model:

Code:

logistic seekhelp i.sexethnic##i.comorbid sex i.incomeq

Above, the outcome is 'seekhelp': 0=those who do not seek medical help and 1=those who do

My main predictors are 'sexethnic', which indicates an individual's sexual and ethnic identities (four categories), and 'comorbid', which is binary (0=no comorbidity and 1=having comorbidity).

. logistic selfharm i.sexethnic##i.comorbid sex i.incomeq

Code:

Logistic regression                                     Number of obs =  9,030
                                                        LR chi2(12)   = 852.35
                                                        Prob > chi2   = 0.0000
Log likelihood = -4414.4308                             Pseudo R2     = 0.0880

------------------------------------------------------------------------------------
          seekhelp | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------------+----------------------------------------------------------------
         sexethnic |
         White-SM  |   3.640788   .2425844    19.39   0.000     3.195068    4.148687
  EM-heterosexual  |   .5976129   .0555956    -5.53   0.000     .4980044    .7171446
            EM-SM  |   2.752875    .407243     6.85   0.000      2.05999    3.678814
                   |
        1.comorbid |   3.145764   .3903478     9.24   0.000     2.466626     4.01189
                   |
sexethnic#comorbid |
       White-SM#1  |   .6745556   .1313252    -2.02   0.043     .4605771    .9879459
EM-heterosexual#1  |   .8079153   .2667794    -0.65   0.518     .4229538    1.543259
          EM-SM#1  |   .5862236    .253474    -1.24   0.217     .2511984    1.368074
                   |
               sex |   1.469779   .0798558     7.09   0.000      1.32131    1.634931
                   |
          incomeq3 |
                2  |   1.036387   .0871094     0.43   0.671     .8789776    1.221987
                3  |   .8280044   .0718018    -2.18   0.030     .6985851    .9813998
                4  |   .8287136   .0724647    -2.15   0.032     .6981895    .9836387
                5  |   .8039571   .0693916    -2.53   0.011     .6788339     .952143
                   |
             _cons |   .1301035   .0139224   -19.06   0.000     .1054877    .1604635
------------------------------------------------------------------------------------

To understand the interaction terms in the model above, I run margins to estimate predicted probabilities:

Code:

margins (sexethnic#comorbid)

Code:

. margins (sexethnic#comorbid)

Predictive margins                                       Number of obs = 9,030
Model VCE: OIM

Expression: Pr(selfharm), predict()

---------------------------------------------------------------------------------------
                      |            Delta-method
                      |     Margin   std. err.      z    P>|z|     [95% conf. interval]
----------------------+----------------------------------------------------------------
   sexethnic#comorbid |
White-Heterosexual#0  |   .1737261   .0052282    33.23   0.000     .1634791    .1839731
White-Heterosexual#1  |   .3954843   .0280088    14.12   0.000     .3405881    .4503805
          White-SM#0  |   .4304796   .0135683    31.73   0.000     .4038861     .457073
          White-SM#1  |   .6138357   .0329269    18.64   0.000        .5493    .6783713
   EM-heterosexual#0  |   .1119266   .0083708    13.37   0.000       .09552    .1283331
   EM-heterosexual#1  |   .2414332   .0536803     4.50   0.000     .1362217    .3466447
             EM-SM#0  |   .3644274   .0327756    11.12   0.000     .3001885    .4286663
             EM-SM#1  |    .512133   .0962549     5.32   0.000     .3234768    .7007892
---------------------------------------------------------------------------------------

If I understand the above correctly, for each possible combination between the two interacted variables, we have estimated the proportion of individuals who have the outcome seekhelp (i.e. coded as 1).
For example, in the White-heterosexual group, 17% of participants with no comorbidity (coded 0) had the outcome seekhelp, which goes up to about 40% in those with comorbidity (coded as '1').

Similarly, in the EM-SM group, 36% of participants with no comorbidity had the outcome seekhelp, which increases to 51% in those with comorbidity.
Have I interpreted the above correctly?

Further, if these are proportions (or percentages), then for the EM-SM group above, 36% with no comorbidity have the outcome seekhelp (thus 64% do not have the outcome?). That is, is each margin presented a proportion (or a percentage out of 100)?

The 'take home' message here is that across all four categories of 'sexethnic', having comorbidity (coded as '1') increases the odds of seeking help.
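
As a rough check on the odds-ratio side of this, the odds ratio for comorbid within each sexethnic group can be recovered from the regression table above by multiplying the main effect of comorbid by the relevant interaction term. A minimal sketch using only the odds ratios reported in that output (White-heterosexual is the reference group, so its odds ratio is the main effect itself):

```stata
* Odds ratio for comorbid (1 vs 0) within each sexethnic group,
* computed by hand from the odds ratios in the logistic output above
display "White-heterosexual: " 3.145764
display "White-SM:           " 3.145764 * .6745556
display "EM-heterosexual:    " 3.145764 * .8079153
display "EM-SM:              " 3.145764 * .5862236
```

These come to roughly 3.15, 2.12, 2.54 and 1.84, so each is above 1 even though the interaction terms themselves are below 1. With the fitted model in memory, lincom with the or option on the sum of the comorbid coefficient and the matching interaction coefficient should give the same quantities with confidence intervals; the level numbers to use depend on how sexethnic is coded, which is not shown here.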

Hope the above makes sense!

Thanks
/Amal


Amal Khanolkar

Join Date: Feb 2015

Posts: 146
#2

29 Aug 2022, 06:33

Just wondering if someone could cross-check my interpretation above?

Thanks!
/Amal
Erik Ruzek

Join Date: Oct 2017

Posts: 461
#3

29 Aug 2022, 08:50

Your interpretations seem correct. However, I am confused about why the margins output says

Code:

Expression: Pr(selfharm), predict()

when your outcome in the original logistic regression model was seekhelp. Was this margins command run directly after the logistic model you presented first?
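
One way to rule out a mismatch like this is to run the model and margins back to back and check which outcome the stored estimates refer to. A minimal sketch, assuming the variable names from the first post and that seekhelp is the intended outcome:

```stata
* Refit the model, then confirm which outcome is stored with the estimates
logistic seekhelp i.sexethnic##i.comorbid sex i.incomeq
display "`e(depvar)'"

* margins works from whatever estimates are currently in memory
margins sexethnic#comorbid
```

If the displayed name matches the outcome in the Expression line of the margins output, the two pieces of output come from the same model.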
Amal Khanolkar

Join Date: Feb 2015

Posts: 146
#4

29 Aug 2022, 08:59

Hi Erik

Thanks for the confirmation!

And good spotting! I changed the name of the variable from selfharm to seekhelp and back again, and copied the outputs from two separate runs. The margins output is indeed based on the same logistic regression model!

So, are predicted probabilities always interpreted as percentages (or is that just the easier way to interpret them)?

Thanks
/Amal
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17851
#5

29 Aug 2022, 09:26

Amal:
"...proportion is the natural estimate of the probability of an event" (see page 86 in https://global.oup.com/academic/prod...cc=it&lang=en&)

Kind regards,
Carlo
(Stata 19.0)
Erik Ruzek

Join Date: Oct 2017

Posts: 461
#6

29 Aug 2022, 11:31

Not a problem. Just to add a bit to Carlo's post, I would suggest thinking of these in terms of the probability of having a 1 on the outcome vs. a 0. So the White-SM#1 group has the highest probability of the examined groups, with the probability of self harm estimated as .61 (SE .03), adjusting for the other predictors in the model. For a really good primer on understanding and interpreting margins, see Rich Williams' excellent Stata Journal article on the topic.
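
To put a number on how much comorbidity shifts the probability within each group, one option is to ask margins for the discrete change in the predicted probability rather than the probabilities themselves. A sketch, assuming it is run right after the logistic model from the first post:

```stata
* Predicted probabilities for each sexethnic-by-comorbid combination
margins sexethnic#comorbid

* Difference in predicted probability (comorbid 1 vs 0) within each group
margins sexethnic, dydx(comorbid)

* Plot the estimates from the most recent margins command
marginsplot
```

Going by the margins table in the first post, these differences should be about .22, .18, .13 and .15 for White-heterosexual, White-SM, EM-heterosexual and EM-SM respectively, each with its own standard error and confidence interval.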

Last edited by Erik Ruzek; 29 Aug 2022, 11:34.