  • Advice on interpreting categorical X categorical interactions in logistic regression model (and using margins)

    Hi All

    I would like some advice on interpreting categorical X categorical interactions in a logistic regression model that also adjusts for other categorical predictors. I've tried to make sense of the interactions by estimating predicted probabilities using margins.

    The logistic regression model:

    Code:
    logistic seekhelp i.sexethnic##i.comorbid sex i.incomeq
    Above, the outcome is 'seekhelp': 0=those who do not seek medical help and 1=those who do.

    My main predictors are 'sexethnic', which indicates an individual's sexual and ethnic identity with four categories, and 'comorbid', which is binary: 0=no comorbidity and 1=having comorbidity.

    . logistic selfharm i.sexethnic##i.comorbid sex i.incomeq

    Code:
    Logistic regression                                     Number of obs =  9,030
                                                            LR chi2(12)   = 852.35
                                                            Prob > chi2   = 0.0000
    Log likelihood = -4414.4308                             Pseudo R2     = 0.0880
    
    ------------------------------------------------------------------------------------
              seekhelp | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------------+----------------------------------------------------------------
             sexethnic |
             White-SM  |   3.640788   .2425844    19.39   0.000     3.195068    4.148687
      EM-heterosexual  |   .5976129   .0555956    -5.53   0.000     .4980044    .7171446
                EM-SM  |   2.752875    .407243     6.85   0.000      2.05999    3.678814
                       |
            1.comorbid |   3.145764   .3903478     9.24   0.000     2.466626     4.01189
                       |
    sexethnic#comorbid |
           White-SM#1  |   .6745556   .1313252    -2.02   0.043     .4605771    .9879459
    EM-heterosexual#1  |   .8079153   .2667794    -0.65   0.518     .4229538    1.543259
              EM-SM#1  |   .5862236    .253474    -1.24   0.217     .2511984    1.368074
                       |
                   sex |   1.469779   .0798558     7.09   0.000      1.32131    1.634931
                       |
              incomeq3 |
                    2  |   1.036387   .0871094     0.43   0.671     .8789776    1.221987
                    3  |   .8280044   .0718018    -2.18   0.030     .6985851    .9813998
                    4  |   .8287136   .0724647    -2.15   0.032     .6981895    .9836387
                    5  |   .8039571   .0693916    -2.53   0.011     .6788339     .952143
                       |
                 _cons |   .1301035   .0139224   -19.06   0.000     .1054877    .1604635
    ------------------------------------------------------------------------------------
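
    One way to read the interaction rows in this table: each one is a multiplier on the comorbidity odds ratio, so the comorbidity odds ratio within a non-reference group is the 1.comorbid odds ratio times that group's interaction term (for White-SM, 3.145764 x .6745556, roughly 2.12). A minimal sketch with lincom, assuming the model above is still the active estimation result and that sexethnic is coded 1 to 4 in the order shown in the table (the numeric codes are an assumption, not confirmed in the post):

    Code:
    * Comorbidity odds ratio within each sexethnic group.
    * Level numbers 2/3/4 are assumed codes for White-SM, EM-heterosexual, EM-SM.
    * White-heterosexual (reference group): the main effect alone
    lincom 1.comorbid, or
    * White-SM: main effect plus its interaction term
    lincom 1.comorbid + 2.sexethnic#1.comorbid, or
    * EM-heterosexual
    lincom 1.comorbid + 3.sexethnic#1.comorbid, or
    * EM-SM
    lincom 1.comorbid + 4.sexethnic#1.comorbid, or

    Multiplying the reported odds ratios by hand gives roughly 2.12, 2.54 and 1.84 for the three non-reference groups; lincom additionally supplies a confidence interval for each combined odds ratio.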


    To understand the interaction terms in the model above, I run margins to estimate predicted probabilities:

    Code:
    margins (sexethnic#comorbid)
    Code:
    . margins (sexethnic#comorbid)
    
    Predictive margins                                       Number of obs = 9,030
    Model VCE: OIM
    
    Expression: Pr(selfharm), predict()
    
    ---------------------------------------------------------------------------------------
                          |            Delta-method
                          |     Margin   std. err.      z    P>|z|     [95% conf. interval]
    ----------------------+----------------------------------------------------------------
       sexethnic#comorbid |
    White-Heterosexual#0  |   .1737261   .0052282    33.23   0.000     .1634791    .1839731
    White-Heterosexual#1  |   .3954843   .0280088    14.12   0.000     .3405881    .4503805
              White-SM#0  |   .4304796   .0135683    31.73   0.000     .4038861     .457073
              White-SM#1  |   .6138357   .0329269    18.64   0.000        .5493    .6783713
       EM-heterosexual#0  |   .1119266   .0083708    13.37   0.000       .09552    .1283331
       EM-heterosexual#1  |   .2414332   .0536803     4.50   0.000     .1362217    .3466447
                 EM-SM#0  |   .3644274   .0327756    11.12   0.000     .3001885    .4286663
                 EM-SM#1  |    .512133   .0962549     5.32   0.000     .3234768    .7007892
    ---------------------------------------------------------------------------------------



    If I understand the above correctly, for each possible combination of the two interacted variables, we have estimated the proportion of individuals who have the outcome seekhelp (i.e. coded as 1).
    For example, in the White-heterosexual group, 17% of participants with no comorbidity (coded 0) had the outcome seekhelp, which goes up to 40% in those with comorbidity (coded as '1').

    Similarly, in the EM-SM group, 36% of participants with no comorbidity had the outcome seekhelp, which increases to 51% if they have comorbidity.
    Have I interpreted the above correctly?

    Further, if these are proportions (or percentages), then for the EM-SM group above, 36% of those with no comorbidity have the outcome seekhelp (and thus 64% do not have the outcome?). That is, is each margin presented a percentage or proportion out of 100?
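
    Because the outcome is binary, the predicted probability of a 0 is one minus the predicted probability of a 1. A small sketch of that complement, assuming the logistic model above is the active estimation result; the first line only repeats the arithmetic for the EM-SM#0 margin, and the second asks margins for the complement in every cell:

    Code:
    * Complement of the EM-SM, no-comorbidity margin (about .64)
    display 1 - .3644274
    * Predicted probability of the outcome being 0 for each combination
    margins sexethnic#comorbid, expression(1 - predict(pr))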

    The 'take home' message here is that across all four categories of 'sexethnic', having comorbidity (coded as '1') increases the odds of seeking help.
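
    To put a number on that comparison on the probability scale, one option is to ask margins for the discrete change in the predicted probability when comorbid goes from 0 to 1, separately for each sexethnic group. This is a sketch under the assumption that the logistic model above is the active estimation result; the results should correspond to the differences between the #1 and #0 rows of the margins table (for example .3954843 - .1737261, about 22 percentage points, for White-heterosexual):

    Code:
    * Change in Pr(outcome = 1) for comorbid 1 vs 0, within each sexethnic group
    margins sexethnic, dydx(comorbid)
    * Optional: plot the eight predicted probabilities with confidence intervals
    margins sexethnic#comorbid
    marginsplot

    Note that a group can show a smaller odds ratio for comorbidity and still a sizeable change in probability, since the two scales do not move in lockstep.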

    Hope the above makes sense!

    Thanks
    /Amal

  • #2
    Just wondering if someone could cross-check my interpretation above?

    Thanks!
    /Amal



    • #3
      Your interpretations seem correct; however, I am confused about why the margins output says
      Code:
      Expression: Pr(selfharm), predict()
      when your outcome in the original logistic regression model was seekhelp. Was this margins command run directly after the logistic model you presented first?



      • #4
        Hi Erik

        Thanks for the confirmation!

        And good spotting! I changed the name of the variable from selfharm to seekhelp and back again, and copied the outputs from two separate runs, but the margins output is indeed based on the same logistic regression model.

        So, are predicted probabilities always interpreted as percentages (or is that just the easier way to interpret them)?

        Thanks
        /Amal



        • #5
          Amal:
          "...proportion is the natural estimate of the probability of an event" (see page 86 in https://global.oup.com/academic/prod...cc=it&lang=en&)
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Not a problem. Just to add a bit to Carlo's post, I would suggest thinking of these in terms of the probability of having a 1 on the outcome vs. a 0. So the White-SM#1 group has the highest probability of the examined groups, with the probability of self-harm estimated as .61 (SE .03), adjusting for the other predictors in the model. For a really good primer on understanding and interpreting margins, see Rich Williams' excellent Stata Journal article on the topic.
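
            For intuition about what "adjusting for the other predictors" means here, a predictive margin can be reproduced by hand: treat everyone in the estimation sample as if they were in a given cell, predict, and average. A rough sketch for the White-SM#1 cell, assuming the logistic model is the active estimation result and that White-SM is coded 2 (the code value and the variable name p_hat are assumptions for illustration):

            Code:
            * Work on a temporary copy of the data
            preserve
            keep if e(sample)
            * Set everyone to White-SM with comorbidity, leaving sex and income as observed
            replace sexethnic = 2
            replace comorbid = 1
            * Predicted probability of the outcome for each person, then the average
            predict double p_hat, pr
            summarize p_hat
            restore

            The mean of p_hat should be close to the .61 reported by margins for that cell; margins does the same averaging and adds the delta-method standard error.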

            Last edited by Erik Ruzek; 29 Aug 2022, 11:34.
