  • Using lincom with logistic regression and two categorical variables

    Hello
    I am a novice Stata user.
    I have a simple regression model looking at the effect of a binary variable (gender) and a categorical variable (age group: 1 = <65, 2 = 65-74, 3 = 75+) on a binary dependent variable "A". I am not sure how to interpret the output from the following command, and whether I should use lincom afterwards to calculate the ORs for the interaction terms or just use the existing output.

    logistic A i.agegroup i.gender i.agegroup#i.gender


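    Logistic regression                             Number of obs     =      20919
                                                    LR chi2(6)        =     421.81
                                                    Prob > chi2       =     0.0000
    Log likelihood = -12102.948                     Pseudo R2         =     0.0171

    ------------------------------------------------------------------------------------
                   A | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -----------------+------------------------------------------------------------------
           agegroup3 |
                  2  |   1.618443    .074152    10.51   0.000     1.479442    1.770503
                  3  |   2.237386   .1203289    14.97   0.000      2.01355    2.486105
                     |
            1.gender |   1.410541   .0937332     5.18   0.000     1.238288    1.606755
                     |
    agegroup3#gender |
                2#1  |   .7326569   .0705965    -3.23   0.001     .6065709    .8849519
                3#1  |   .7091284   .0663833    -3.67   0.000     .5902577    .8519381
    ------------------------------------------------------------------------------------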


    The ORs for the interaction terms are lower than those for the main effects, and I was expecting them to be >1. Why is that? I tested whether the interaction was significant using a postestimation LR test, and it was.

    Should I now use lincom to calculate the ORs for the interaction terms separately?

    Many thanks everyone



  • #2
    The coefficients (or odds ratios) shown for the interaction terms in the regression output are the incremental contribution of the interaction term over and above the "main effect" term. So, yes, -lincom- will get you there.

    But it is probably simpler and less error prone to use -margins agegroup3#gender, predict(xb)- to get a table of the log-odds-ratios for each combination of agegroup3 and gender, and then you can exponentiate those to get the odds ratios themselves.
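
    For the model in #1, the -lincom- route can be sketched as follows. This is a minimal sketch, not the original poster's actual session: it assumes the variables are named A, agegroup3 and gender as in the posted output, with agegroup3 = 1 and gender = 0 as the base levels. Each -lincom- adds the gender main effect and one interaction term on the log-odds scale, and the or option reports the sum as an odds ratio, i.e. the gender odds ratio within that age group.

    ```stata
    * Assumed variable names, taken from the output posted in #1
    logistic A i.agegroup3 i.gender i.agegroup3#i.gender

    * Gender odds ratio within age group 2 (main effect + 2#1 interaction)
    lincom 1.gender + 2.agegroup3#1.gender, or

    * Gender odds ratio within age group 3 (main effect + 3#1 interaction)
    lincom 1.gender + 3.agegroup3#1.gender, or
    ```

    On the odds-ratio scale the same combination is a product: from the output in #1, 1.410541 * .7326569 is about 1.03 for age group 2, and 1.410541 * .7091284 is about 1.00 for age group 3. Interaction odds ratios below 1 therefore only say that the gender odds ratio is smaller in the older groups than in the base group.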



    • #3
      This slight tweak of Clyde's advice may save a step:

      Code:
      webuse nhanes2f,clear
      logistic diabetes i.agegrp i.sex i.agegrp#i.sex
      margins agegrp#sex, expression(exp(predict(xb)))
      Incidentally, Kate, your original post would be much easier to read if you used code tags. See pt. 12 of the FAQ.
      -------------------------------------------
      Richard Williams, Notre Dame Dept of Sociology
      StataNow Version: 19.5 MP (2 processor)

      WWW: https://www3.nd.edu/~rwilliam



      • #4
        Yes, of course, Richard's is better than mine. Somehow the expression() option slipped my mind!



        • #5
          Clyde-
          Unless I am interpreting something wrong, the above code gives predicted probabilities (margins), not log ORs (your code) or ORs (Richard's code).

          Code:
          . webuse nhanes2f,clear
          
          . logistic diabetes i.agegrp i.sex i.agegrp#i.sex
          
          Logistic regression                             Number of obs     =     10,335
                                                          LR chi2(11)       =     352.94
                                                          Prob > chi2       =     0.0000
          Log likelihood = -1822.5986                     Pseudo R2         =     0.0883
          
          ----------------------------------------------------------------------------------
                  diabetes | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
                    agegrp |
                 age30-39  |   1.087351   .8320444     0.11   0.913     .2426767     4.87205
                 age40-49  |   7.488196   4.203287     3.59   0.000     2.492179    22.49962
                 age50-59  |    15.6344   8.330455     5.16   0.000     5.502208    44.42482
                 age60-69  |   24.36647   12.44328     6.25   0.000     8.955863    66.29454
                  age 70+  |   36.59737   19.11003     6.89   0.000     13.15159    101.8407
                           |
                       sex |
                   Female  |   2.563279    1.50044     1.61   0.108     .8138359    8.073371
                           |
                agegrp#sex |
          age30-39#Female  |   2.157879   1.848788     0.90   0.369     .4024856    11.56921
          age40-49#Female  |   .5948949   .3976777    -0.78   0.437     .1604818    2.205234
          age50-59#Female  |   .4282047    .271563    -1.34   0.181     .1235458    1.484139
          age60-69#Female  |   .4192236   .2519118    -1.45   0.148     .1291094    1.361236
           age 70+#Female  |   .3650181     .22603    -1.63   0.104     .1084489     1.22858
                           |
                     _cons |   .0035971   .0018018   -11.24   0.000     .0013477    .0096011
          ----------------------------------------------------------------------------------
          Note: _cons estimates baseline odds.
          
          . margins agegrp#sex, expression(exp(predict(xb)))
          
          Adjusted predictions                            Number of obs     =     10,335
          Model VCE    : OIM
          
          Expression   : exp(predict(xb))
          
          ----------------------------------------------------------------------------------
                           |            Delta-method
                           |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
                agegrp#sex |
            age20-29#Male  |   .0035971   .0018018     2.00   0.046     .0000657    .0071286
          age20-29#Female  |   .0092205   .0027929     3.30   0.001     .0037466    .0146944
            age30-39#Male  |   .0039113   .0022626     1.73   0.084    -.0005233     .008346
          age30-39#Female  |   .0216346   .0051542     4.20   0.000     .0115326    .0317366
            age40-49#Male  |    .026936   .0068241     3.95   0.000      .013561     .040311
          age40-49#Female  |   .0410742   .0082191     5.00   0.000     .0249651    .0571834
            age50-59#Male  |    .056239   .0102175     5.50   0.000     .0362131    .0762649
          age50-59#Female  |   .0617284   .0100568     6.14   0.000     .0420173    .0814394
            age60-69#Male  |   .0876494   .0087156    10.06   0.000     .0705671    .1047317
          age60-69#Female  |   .0941869   .0087083    10.82   0.000      .077119    .1112548
             age 70+#Male  |   .1316456   .0194205     6.78   0.000     .0935821     .169709
           age 70+#Female  |   .1231733   .0169947     7.25   0.000     .0898642    .1564823
          ----------------------------------------------------------------------------------
          
          .



          • #6
            I'm not sure which code you're referring to, as there are several different suggestions in the thread at this point.

            My original suggestion was -margins agegrp#sex, predict(xb)-, which would, in fact, give log odds. Richard suggested -margins agegrp#sex, expression(exp(predict(xb)))-, which would in fact give odds. To get probabilities, just use -margins agegrp#sex- without options, since probability is the default output from -margins- after logit.
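
            Side by side, the three variants can be run as below. This is only a sketch that reuses the nhanes2f example from #3; the comments give the scale each command reports.

            ```stata
            webuse nhanes2f, clear
            logistic diabetes i.agegrp i.sex i.agegrp#i.sex

            * Log odds for each agegrp-by-sex cell
            margins agegrp#sex, predict(xb)

            * Odds for each cell (exponentiated linear prediction)
            margins agegrp#sex, expression(exp(predict(xb)))

            * Predicted probabilities for each cell (the default)
            margins agegrp#sex
            ```

            With an outcome this rare the odds and the probabilities are numerically close, which makes the two tables easy to mix up.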



            • #7
              Originally posted by Clyde Schechter
              The coefficients (or odds ratios) shown for the interaction terms in the regression output are the incremental contribution of the interaction term over and above the "main effect" term. So, yes, -lincom- will get you there.

              But it is probably simpler and less error prone to use -margins agegroup3#gender, predict(xb)- to get a table of the log-odds-ratios for each combination of agegroup3 and gender, and then you can exponentiate those to get the odds ratios themselves.

              I was referring to the statement above.

              My apologies if this seems like nitpicking, or if I interpreted your words incorrectly, but I just wanted to make sure that I understand and interpret these numbers correctly. It is obvious that we can use these numbers to estimate the odds ratios.

              Would it be correct that if we wanted to estimate the odds ratio of diabetes for females relative to males within the 20-29 age group, we would divide 0.0092205 by 0.0035971 to get an odds ratio of 2.56?



              • #8
                Yes.
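
                The division can be checked against the regression output with a short sketch like the one below. It assumes the logistic model from #5 is still the active estimation result and relies on the nhanes2f coding (sex: 1 = Male, 2 = Female; agegrp: 1 = age20-29, 2 = age30-39).

                ```stata
                * age20-29: odds for females / odds for males, from the margins table in #5
                display .0092205/.0035971

                * age30-39: the same ratio of odds
                display .0216346/.0039113

                * age30-39 again, as main effect + interaction, with a confidence interval
                lincom 2.sex + 2.agegrp#2.sex, or
                ```

                The first ratio reproduces the Female odds ratio of 2.563279 in the regression table, because age20-29 is the base age group. The second is about 5.53, which is 2.563279 * 2.157879, the quantity -lincom- reports.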



                • #9
                  Hi Clyde,
                  I am following on this old post with a related question. I was reading the following link about interactions in logistic regression and came across this code to get the odds via the margins command.
                  Deciphering Interactions in Logistic Regression (ucla.edu)

                  Code:
                  margins, over(f h) at(cv1=50) expression(exp(xb())) noatlegend

                  Predictive margins                              Number of obs     =        200
                  Model VCE    : OIM

                  Expression   : exp(xb())
                  over         : f h

                  ------------------------------------------------------------------------------
                               |            Delta-method
                               |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                           f h |
                          0 0  |   .1304264   .0734908     1.77   0.076    -.0136129    .2744657
                          0 1  |   1.424706    .515989     2.76   0.006     .4133857    2.436025
                          1 0  |   2.609533   1.136545     2.30   0.022     .3819457    4.837121
                          1 1  |   3.677847   1.311463     2.80   0.005     1.107427    6.248267
                  ------------------------------------------------------------------------------

                  The option expression(exp(xb())) insures that we are looking at results in the odds ratio metric. The baseline odds are now .1304264 which is reasonable. We will compute the odds ratio for each level of f.
                  odds ratio 1 at f=0: 1.424706/.1304264 = 10.923446
                  odds ratio 2 at f=1: 3.677847/2.609533 = 1.4093889
                  My question is: how do I get confidence intervals for the odds ratios calculated in the example above? I am conducting a similar analysis in which I have a 2x5 category interaction term, and I am interested in estimating the ORs for all categories compared to the 00 reference.

                  This margins code
                  Code:
                  margins, over(time race) expression(exp(xb())) noatlegend
                  gives me the odds for each of the 10 interaction strata, and I can use those odds to estimate the ORs compared to the 00 category, but how do I estimate the confidence intervals for those ORs?
                  Strata (time, race): 00, 01, 02, 03, 04, 10, 11, 12, 13, 14

                  Additional note: I am using this margins postestimation code after an ologit regression. I am assuming the estimation of proportional odds would be similar.

                  Thanks

                  Ashar
                  Last edited by ashar ata; 12 Aug 2024, 18:31.



                  • #10
                    Well, you could get the ORs and confidence intervals by adding a -post- option to your -margins- command and then using -nlcom- to get the odds ratios.

                    But this is unnecessarily complicated. Just read out the odds ratios from the -ologit- output directly and don't bother with -margins-. Sure, if you want the odds themselves, you are best off using -margins- for that. But the odds ratios are right there in the -ologit- output and there is no reason to go through a complicated set of calculations to get them via -margins- and -nlcom-.
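
                    For anyone who does want the -post- plus -nlcom- route, a minimal sketch follows. It uses the nhanes2f logistic example from earlier in the thread rather than the -ologit- model in #9, which is not shown, and the label or_f_2029 is an arbitrary name chosen for illustration. The _b[] names are those expected for these factor variables; listing e(b) first confirms them.

                    ```stata
                    webuse nhanes2f, clear
                    logistic diabetes i.agegrp i.sex i.agegrp#i.sex

                    * post replaces the estimation results with the table of odds
                    margins agegrp#sex, expression(exp(predict(xb))) post

                    * Check the names of the posted margins before using them in _b[]
                    matrix list e(b)

                    * Odds ratio, females vs males within age20-29, with a delta-method CI
                    nlcom (or_f_2029: _b[1.agegrp#2.sex] / _b[1.agegrp#1.sex])
                    ```

                    The interval from -nlcom- is a normal-based delta-method interval on the odds-ratio scale, so it is symmetric around the estimate and will not match the interval in the regression table, which is computed on the log scale and then exponentiated.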

