  • Interpret interactions in Cox regression

    Hi,

    I am having trouble understanding the difference between two parameterizations of a Cox regression.


    Code:
    use https://www.stata.com/data/jwooldridge/eacsap/recid, clear
    
    gen fail = 1 - cens
    
    stset durat, failure(fail)
    
    stcox i.alcohol##c.educ, base
    
    margins i.alcohol, at(educ=(0))
    margins i.alcohol, at(educ=(1))
    
    stcox i.alcohol#c.educ, base

    Please could you show me how I can obtain the results presented in

    Code:
    stcox i.alcohol#c.educ, base
    
    --------------------------------------------------------------------------------
                _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
    ---------------+----------------------------------------------------------------
    alcohol#c.educ |
                0  |   .9840041   .0166281    -0.95   0.340     .9519475     1.01714
                1  |   1.011573   .0186091     0.63   0.532     .9757497    1.048712
    --------------------------------------------------------------------------------
    from the model presented in

    Code:
    stcox i.alcohol##c.educ, base
    
    --------------------------------------------------------------------------------
                _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
    ---------------+----------------------------------------------------------------
           alcohol |
                0  |          1  (base)
                1  |   .7489855   .2760267    -0.78   0.433     .3637279    1.542305
                   |
              educ |   .9757365   .0194557    -1.23   0.218     .9383395    1.014624
                   |
    alcohol#c.educ |
                1  |   1.057145   .0389876     1.51   0.132     .9834275    1.136389
    --------------------------------------------------------------------------------
    I understand that # and ## specify the same model under different parameterizations. However, I cannot see how to obtain the estimates of the # model from the estimates of the ## model.

    Basically, I would like to know how to obtain the effect of years of education among drinkers and non-drinkers. I think this is what the model
    Code:
     stcox i.alcohol#c.educ, base
    estimates. However, I don't understand how I can derive those results from the model
    Code:
     stcox i.alcohol##c.educ, base
    Thank you in advance for your help

    Andrew
    Last edited by Andrew Xavier; 08 Aug 2022, 13:33.

  • #2
    As you show neither the output you are trying to interpret, nor example data with which to create them from the code you showed, I'll illustrate the general phenomenon with an example from the auto.dta dataset.

    Code:
    . sysuse auto, clear
    (1978 automobile data)
    
    . regress price i.foreign##c.mpg
    
          Source |       SS           df       MS      Number of obs   =        74
    -------------+----------------------------------   F(3, 70)        =      9.48
           Model |   183435281         3  61145093.6   Prob > F        =    0.0000
        Residual |   451630115        70  6451858.79   R-squared       =    0.2888
    -------------+----------------------------------   Adj R-squared   =    0.2584
           Total |   635065396        73  8699525.97   Root MSE        =    2540.1
    
    -------------------------------------------------------------------------------
            price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    --------------+----------------------------------------------------------------
          foreign |
         Foreign  |  -13.58741   2634.664    -0.01   0.996    -5268.258    5241.084
              mpg |  -329.2551   74.98545    -4.39   0.000    -478.8088   -179.7013
                  |
    foreign#c.mpg |
         Foreign  |   78.88826   112.4812     0.70   0.485    -145.4485     303.225
                  |
            _cons |   12600.54   1527.888     8.25   0.000     9553.261    15647.81
    -------------------------------------------------------------------------------
    
    . regress price i.foreign#c.mpg
    
          Source |       SS           df       MS      Number of obs   =        74
    -------------+----------------------------------   F(2, 71)        =     14.42
           Model |   183435109         2  91717554.6   Prob > F        =    0.0000
        Residual |   451630287        71  6360989.96   R-squared       =    0.2888
    -------------+----------------------------------   Adj R-squared   =    0.2688
           Total |   635065396        73  8699525.97   Root MSE        =    2522.1
    
    -------------------------------------------------------------------------------
            price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    --------------+----------------------------------------------------------------
    foreign#c.mpg |
        Domestic  |  -329.0368   61.46843    -5.35   0.000    -451.6014   -206.4723
         Foreign  |  -250.7077   51.21966    -4.89   0.000    -352.8368   -148.5786
                  |
            _cons |   12595.97   1235.936    10.19   0.000     10131.58    15060.35
    -------------------------------------------------------------------------------
    Notice that the coefficient of mpg in the ## model is essentially the same as the Domestic#c.mpg coefficient in the # model. That is not a coincidence: the coefficient of the continuous variable in the ## model is the slope at the base level of the discrete variable, which is what the base-level term of discrete#continuous reports in the # model.

    Then notice that the Foreign#c.mpg coefficient in the # model is essentially equal to the mpg coefficient in the ## model PLUS the Foreign#c.mpg coefficient (also from the ## model).

    That is how you can crosswalk between the two models. It is, of course, easier to just let -margins- do the work for you in practice.
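
    As a rough sketch of letting Stata do that arithmetic (the -lincom- and -margins- calls below are an illustration, not part of the output shown above), the two group-specific slopes can be requested directly after the ## model:

    ```stata
    sysuse auto, clear

    * Full factorial model: foreign, mpg, and their interaction
    regress price i.foreign##c.mpg

    * Slope of mpg for the base level (Domestic): the mpg coefficient itself
    lincom mpg

    * Slope of mpg for Foreign: mpg coefficient plus the interaction coefficient
    lincom mpg + 1.foreign#c.mpg

    * The same two slopes in one call, one row per level of foreign
    margins foreign, dydx(mpg)
    ```

    The first -lincom- simply reproduces the mpg row of the ## table; the second adds the two ## coefficients, -329.2551 + 78.88826.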

    Now, in a Cox regression it works the same way if we look at regression coefficients. But the default output from a Cox regression in Stata is not the regression coefficients but the hazard ratios. For hazard ratios, it will work similarly except that where addition was used with coefficients, multiplication must be used for the hazard ratios. (coefficient = log(hr)).
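
    To make the add-versus-multiply point concrete, here is a minimal sketch. The two hazard ratios are hypothetical numbers chosen only for illustration; they do not come from any model in this thread.

    ```stata
    * Hypothetical hazard ratios, for illustration only
    scalar hr_x     = 0.90   // HR of the continuous variable at the base level
    scalar hr_inter = 1.20   // HR of the interaction term

    * On the coefficient (log hazard) scale the two terms add
    scalar b_sum = ln(hr_x) + ln(hr_inter)

    * Exponentiating the sum gives the same number as multiplying the HRs
    display exp(b_sum)
    display hr_x * hr_inter
    ```

    Both -display- lines should show approximately 1.08, which would be the hazard ratio per unit of the continuous variable in the non-base group.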



    • #3
      Dear Clyde,

      Thank you for your response.

      I have edited my post.

      Looking at the Cox regression output, could you show me how to obtain the estimates of the # model from the ## model?

      Thanks

      Andrew



      • #4
        I don't think the estimates are exactly the same. I have edited my Cox regression code to show you the output.

        Please can you show me how I can map the model ## into the model #?

        Thanks



        • #5
          Oh, wait, sorry, but the models you are talking about are not equivalent. The model that is equivalent to -stcox i.alcohol##c.educ- is -stcox i.alcohol#c.educ i.alcohol-. Without the i.alcohol "main" effect, they are not the same model. If you look at the full output you will notice that your two original models will have different LR chi2 statistics, with different degrees of freedom--a sure tip-off that they are not the same model.
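
          A quick way to see the mismatch, sketched here with the data set and variable names from the code in #1 (the -display- lines are an addition for illustration), is to compare the stored model degrees of freedom and chi2 after each fit:

          ```stata
          * Data set and variables as in post #1
          use https://www.stata.com/data/jwooldridge/eacsap/recid, clear
          gen fail = 1 - cens
          stset durat, failure(fail)

          * ## model: alcohol, educ, and alcohol x educ
          quietly stcox i.alcohol##c.educ
          display "## model: df = " e(df_m) ", chi2 = " %9.2f e(chi2)

          * # model without the alcohol main effect: one educ slope per alcohol level
          quietly stcox i.alcohol#c.educ
          display "#  model: df = " e(df_m) ", chi2 = " %9.2f e(chi2)
          ```

          The ## model estimates three parameters and the # model only two, so the reported degrees of freedom should differ.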

          If you run -stcox i.alcohol#c.educ i.alcohol-, you will be able to crosswalk the coefficients in the way I described in #2. (Best to run them with the -nohr- option so you see the coefficients: it's easier to add them than to multiply hr's.)
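
          A sketch of that crosswalk, again assuming the data set and variable names from #1 (the stored-estimate names m_factorial and m_slopes are arbitrary labels introduced here):

          ```stata
          * Data set and variables as in post #1
          use https://www.stata.com/data/jwooldridge/eacsap/recid, clear
          gen fail = 1 - cens
          stset durat, failure(fail)

          * Full factorial parameterization, shown as coefficients
          stcox i.alcohol##c.educ, nohr
          estimates store m_factorial
          lincom educ                          // educ slope when alcohol == 0
          lincom educ + 1.alcohol#c.educ       // educ slope when alcohol == 1
          lincom educ + 1.alcohol#c.educ, hr   // the same quantity as a hazard ratio

          * Equivalent model: one educ slope per alcohol level plus the alcohol main effect
          stcox i.alcohol#c.educ i.alcohol, nohr
          estimates store m_slopes

          * Same model, so the two log likelihoods should agree
          estimates table m_factorial m_slopes, stats(ll)

          * Alternatively, ask -margins- for the slopes on the log relative-hazard scale
          margins alcohol, dydx(educ) predict(xb)
          ```

          The two -lincom- results from the ## fit should match the 0.alcohol#c.educ and 1.alcohol#c.educ coefficients of the second fit, up to the optimizer's convergence tolerance.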



          • #6
            Thank you very much for your help
