Some advice on categorical interactions in logistic regression models and predicted probabilities, and using 'testparm'

Amal Khanolkar

Join Date: Feb 2015
Posts: 146


11 Oct 2022, 08:47

Dear All

I need some advice on certain statistical concepts and am hoping someone here can provide some direct answers.

I've run a logistic regression model which includes an interaction between two categorical variables. I've used Stata's margins command to make sense of the interaction, and the results are intriguing (and consistent with our hypothesis):

Code:

 
 svy: logistic seekhelp i.sexethnic##i.comorbid sex i.incomeq   
 margins (sexethnic#comorbid)

Above, the outcome is 'seekhelp': 0=those who do not seek medical help and 1=those who do. My main predictors are 'sexethnic', which indicates an individual's sexual and ethnic identity and has four categories, and 'comorbid', which is binary: 0=no comorbidity and 1=having comorbidity.

Code:

  
Logistic regression                                     Number of obs =  9,030
                                                        LR chi2(12)   = 852.35
                                                        Prob > chi2   = 0.0000
Log likelihood = -4414.4308                             Pseudo R2     = 0.0880

------------------------------------------------------------------------------------
          seekhelp | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------------+----------------------------------------------------------------
         sexethnic |
         White-SM  |   3.640788   .2425844    19.39   0.000     3.195068    4.148687
  EM-heterosexual  |   .5976129   .0555956    -5.53   0.000     .4980044    .7171446
            EM-SM  |   2.752875    .407243     6.85   0.000      2.05999    3.678814
                   |
        1.comorbid |   3.145764   .3903478     9.24   0.000     2.466626     4.01189
                   |
sexethnic#comorbid |
       White-SM#1  |   .6745556   .1313252    -2.02   0.043     .4605771    .9879459
EM-heterosexual#1  |   .8079153   .2667794    -0.65   0.518     .4229538    1.543259
          EM-SM#1  |   .5862236    .253474    -1.24   0.217     .2511984    1.368074
                   |
               sex |   1.469779   .0798558     7.09   0.000      1.32131    1.634931
                   |
          incomeq3 |
                2  |   1.036387   .0871094     0.43   0.671     .8789776    1.221987
                3  |   .8280044   .0718018    -2.18   0.030     .6985851    .9813998
                4  |   .8287136   .0724647    -2.15   0.032     .6981895    .9836387
                5  |   .8039571   .0693916    -2.53   0.011     .6788339     .952143
                   |
             _cons |   .1301035   .0139224   -19.06   0.000     .1054877    .1604635
------------------------------------------------------------------------------------

I used testparm to check for joint effects; the p-value was <0.001.

If I understand correctly, testparm performs a joint test of whether all coefficients associated with the interaction of the factor variables 'sexethnic' and 'comorbid' are equal to 0. Since p<0.05, does this tell us that the model with interactions is better than the model without, and hence that I should retain the interactions?
Further, what exactly is a joint test?
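
For reference, a minimal sketch of the kind of testparm call described above. The exact command used is not shown in this post, so the term list below is an assumption; the sketch also assumes the data have already been svyset and that the test is run directly after the interaction model.

```stata
* Sketch only: assumes the data are already svyset.
* Refit the interaction model from the post.
svy: logistic seekhelp i.sexethnic##i.comorbid sex i.incomeq

* Joint (adjusted Wald) test of the null hypothesis that all three
* sexethnic#comorbid interaction coefficients are simultaneously zero.
* Note the single #: this tests the interaction terms only.
testparm i.sexethnic#i.comorbid
```

Writing `##` instead of `#` in the testparm term list would, as far as I understand factor-variable expansion, also include the main effects of sexethnic and comorbid in the test, which answers a different question from whether the interaction terms are needed.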

Further, while I'm not too concerned about the statistical significance of the interactions in the model above, the predicted probabilities indicate that the model with interactions makes more sense (theoretically) than the model without interactions.
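
One way to look at the interaction on the probability scale, rather than through the odds ratios, is sketched below. It assumes the interaction model from this post is the active estimation result; the second command is an illustrative addition and was not part of the original analysis.

```stata
* Sketch only: assumes the svy: logistic interaction model has just been fit.
* Predicted probability of seeking help in each sexethnic-by-comorbid cell.
margins sexethnic#comorbid

* Change in the predicted probability associated with comorbidity,
* computed separately within each sexethnic group.
margins sexethnic, dydx(comorbid)
```

If the comorbidity effects from the second command differ noticeably across the four groups, that is the pattern the interaction is meant to capture, although the confidence intervals need to be read alongside the point estimates.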

However, reviewers are 'insisting' that I need to discuss model fit, i.e. show that the model with interactions is significantly better than the one without. Generally I would use a likelihood ratio test, but this does not work with survey data.
Thus, is testparm sufficient to say that the model with interactions is better (along with the fact that the predicted probabilities from this model also make sense)?

The predicted probabilities are below, in case they are of interest:

. margins (sexethnic#comorbid)

Code:

Predictive margins                                       Number of obs = 9,030
Model VCE: OIM

Expression: Pr(seekhelp), predict()

---------------------------------------------------------------------------------------
                      |            Delta-method
                      |     Margin   std. err.      z    P>|z|     [95% conf. interval]
----------------------+----------------------------------------------------------------
   sexethnic#comorbid |
White-Heterosexual#0  |   .1737261   .0052282    33.23   0.000     .1634791    .1839731
White-Heterosexual#1  |   .3954843   .0280088    14.12   0.000     .3405881    .4503805
          White-SM#0  |   .4304796   .0135683    31.73   0.000     .4038861     .457073
          White-SM#1  |   .6138357   .0329269    18.64   0.000        .5493    .6783713
   EM-heterosexual#0  |   .1119266   .0083708    13.37   0.000       .09552    .1283331
   EM-heterosexual#1  |   .2414332   .0536803     4.50   0.000     .1362217    .3466447
             EM-SM#0  |   .3644274   .0327756    11.12   0.000     .3001885    .4286663
             EM-SM#1  |    .512133   .0962549     5.32   0.000     .3234768    .7007892
---------------------------------------------------------------------------------------

Thanks!

/Amal


Amal Khanolkar

Join Date: Feb 2015
Posts: 146

11 Oct 2022, 08:58

Apologies, something went wrong while using the wrap code function above. The logistic model and predicted probabilities are reposted below:

Code:

. eststo: svy: logit seekhelp i.sexethnic##i.comorbid sex i.incomeq, or
(running logit on estimation sample)

Survey: Logistic regression

Number of strata =   9                            Number of obs   =      9,030
Number of PSUs   = 398                            Population size = 8,285.3391
                                                  Design df       =        389
                                                  F(12, 378)      =      16.28
                                                  Prob > F        =     0.0000

---------------------------------------------------------------------------------------------------------
                                        |             Linearized
                               seekhelp | Odds ratio   std. err.      t    P>|t|     [95% conf. interval]
----------------------------------------+----------------------------------------------------------------
                              sexethnic |
                              White-SM  |   3.022686   .3451914     9.69   0.000     2.414806    3.783588
                       EM-heterosexual  |   .5788723   .1029235    -3.07   0.002     .4081002    .8211051
                                 EM-SM  |   1.740775   .3757014     2.57   0.011     1.138831    2.660883
                                        |
                               comorbid |
                comorbid BMI & poor MH  |   2.833383   .4781054     6.17   0.000     2.033411    3.948075
                                        |
                     sexethnic#comorbid |
       White-SM#comorbid BMI & poor MH  |   .7013723   .2547167    -0.98   0.329     .3434423    1.432331
EM-heterosexual#comorbid BMI & poor MH  |    .772266   .3947025    -0.51   0.613     .2827241    2.109458
          EM-SM#comorbid BMI & poor MH  |   .8924436   .3665347    -0.28   0.782     .3980095    2.001097
                                        |
                                    sex |   1.210408   .1188497     1.94   0.053     .9979105    1.468155
                                        |
                               incomeq3 |
                                     2  |   .9305574   .1443799    -0.46   0.643     .6859042    1.262475
                                     3  |   .8733454   .1463635    -0.81   0.420     .6281862    1.214182
                                     4  |   .8148758   .1419995    -1.17   0.241      .578494    1.147847
                                     5  |   .8627282   .1373705    -0.93   0.354     .6308352    1.179864
                                        |
                                  _cons |    .192911    .043298    -7.33   0.000      .124083    .2999173
---------------------------------------------------------------------------------------------------------
Note: _cons estimates baseline odds.
(est1 stored)

. margins (sexethnic#comorbid)

Predictive margins

Number of strata =   9                            Number of obs   =      9,030
Number of PSUs   = 398                            Population size = 8,285.3391
Model VCE: Linearized                             Design df       =        389

Expression: Pr(seekhelp), predict()

------------------------------------------------------------------------------------------------------------
                                           |            Delta-method
                                           |     Margin   std. err.      t    P>|t|     [95% conf. interval]
-------------------------------------------+----------------------------------------------------------------
                        sexethnic#comorbid |
White-Heterosexual#Normal BMI & normal MH  |   .1885143   .0103066    18.29   0.000     .1682506    .2087779
White-Heterosexual#comorbid BMI & poor MH  |   .3962476   .0381367    10.39   0.000     .3212678    .4712274
          White-SM#Normal BMI & normal MH  |   .4117666   .0222885    18.47   0.000     .3679457    .4555875
          White-SM#comorbid BMI & poor MH  |   .5812066   .0800421     7.26   0.000     .4238374    .7385757
   EM-heterosexual#Normal BMI & normal MH  |   .1186384   .0162973     7.28   0.000     .0865967    .1506801
   EM-heterosexual#comorbid BMI & poor MH  |   .2272572   .0773894     2.94   0.004     .0751033    .3794111
             EM-SM#Normal BMI & normal MH  |   .2876653   .0420611     6.84   0.000     .2049697    .3703609
             EM-SM#comorbid BMI & poor MH  |   .5044761   .0992502     5.08   0.000     .3093422      .69961
------------------------------------------------------------------------------------------------------------
