
  • addressing proportional hazard in a Cox model

    I am analyzing the survival of a group of respiratory patients. One of my variables is baseline CPI, a composite index of respiratory function (the lower, the better).
    The data are in this format:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int id byte sex double age byte smoking float(cpi dead surv_yrs)
    372 1 73.8630136986301 1 56.08594 1 4.7342467
    373 1 66.8328767123288 0  60.6701 1 2.1150684
    374 1 84.4602739726027 0 46.87331 0 1.1315068
    375 0 70.5150684931507 1 51.72653 1 1.7041095
    376 0  76.227397260274 0 53.59892 1  1.241096
    377 0 73.1260273972603 1 49.97387 1  5.493151
    378 0 74.6739726027397 0 53.25577 1 2.0164382
    379 1 88.2520547945206 1 46.58249 1  4.986301
    380 1 78.8027397260274 0  58.7373 1 1.1589041
    381 0 81.4054794520548 0 44.01292 0  5.419178
    382 1 60.3095890410959 1  63.4892 1  4.446575
    383 0 70.3945205479452 0 48.29292 0  4.526027
    384 1 85.5260273972603 1 50.03133 0  5.156164
    385 0 75.0986301369863 0  57.7133 1 1.6136986
    386 1 64.8109589041096 0  69.9197 1 .59178084
    387 1 74.3150684931507 1  34.1793 0  3.660274
    388 0 67.4931506849315 1 41.70098 1   2.79726
    389 1 68.3643835616438 1  63.0056 1  2.832877
    390 1 81.3589041095891 1  45.1427 1       3.4
    391 0 82.1835616438356 0 59.78088 1 2.1452055
    end
    label values sex sex
    label def sex 0 "F", modify
    label def sex 1 "M", modify
    CPI predicts survival in a Cox model, but it violates the proportional-hazards assumption:
    Code:
    stset surv_yrs, fail(dead) id(id) exit(time 8)
    
    stcox cpi sex age, nolog
    
    Failure _d: dead
    Analysis time _t: surv_yrs
    Exit on or before: time 8
    ID variable: id
    
    Cox regression with Breslow method for ties
    
    No. of subjects =        825    Number of obs =    825
    No. of failures =        649
    Time at risk    = 2,802.7699
        LR chi2(4)    = 341.86
    Log likelihood = -3709.1995    Prob > chi2   = 0.0000
    
        
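    ------------------------------------------------------------------------------
              _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
             cpi |   1.075482   .0047164    16.59   0.000     1.066278    1.084766
             sex |    1.07819   .1083775     0.75   0.454     .8853897    1.312975
             age |   1.021587   .0054431     4.01   0.000     1.010974    1.032311
    ------------------------------------------------------------------------------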
    
     estat phtest, detail
    
    Test of proportional-hazards    assumption
    
    Time function: Analysis time
                
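    ----------------------------------------------------------------
                 |       rho         chi2      df       Prob>chi2
    -------------+--------------------------------------------------
             cpi |  -0.30377        63.05       1          0.0000
             sex |   0.02194         0.31       1          0.5765
             age |   0.01998         0.29       1          0.5894
    -------------+--------------------------------------------------
     Global test |                  63.43       3          0.0000
    ----------------------------------------------------------------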
    Looking at survival by quintile of CPI, the problem seems to be that the effect of lower (less severe) baseline CPI values is shifted to the right compared with the higher ones, which makes sense.
    Here are the survival curves and the log-log plot:
    Code:
    xtile q_cpi= cpi, nq(5)
    stcox i.q_cpi sex age_at  if dis==1, nolog
    
          
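    ------------------------------------------------------------------------------
              _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           q_cpi |
              2  |   1.507044   .2330814     2.65   0.008     1.112956    2.040676
              3  |   2.681361    .396617     6.67   0.000     2.006544    3.583124
              4  |   4.550416     .66173    10.42   0.000     3.421903    6.051101
              5  |   7.866322   1.174407    13.82   0.000     5.870717    10.54029
                 |
             sex |    1.05997   .1068369     0.58   0.563     .8699591    1.291481
             age |   1.022326    .005499     4.11   0.000     1.011605    1.033161
    ------------------------------------------------------------------------------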
            
     stcurve, survival at(q_cpi==1) at(q_cpi==2) at(q_cpi==3) at(q_cpi==4) at(q_cpi==5) legend(position(6) col(5))
    
    stphplot, by(q_cpi) legend(position(6) col(5))



    Indeed, when I test separate periods, the proportional-hazards assumption holds between 2 and 4 years, but not before or after.
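
    For readers who want to reproduce this kind of period-by-period check, one way it might be set up is to re-declare the survival data with different entry and exit times and rerun the test in each window. This is only a sketch: it assumes the variables from the example above (surv_yrs, dead, id, cpi, sex, age) are in memory, and the 0-2 and 2-4 year windows are simply the cut points mentioned in the text.

    ```stata
    * Sketch: check proportional hazards within separate follow-up windows.
    * Assumes the full dataset (surv_yrs, dead, id, cpi, sex, age) is in memory.

    * Window 1: first 2 years only (subjects still at risk are censored at 2 years)
    stset surv_yrs, failure(dead) id(id) exit(time 2)
    stcox cpi sex age, nolog
    estat phtest, detail

    * Window 2: 2 to 4 years (subjects enter the risk set at 2 and exit at 4)
    stset surv_yrs, failure(dead) id(id) enter(time 2) exit(time 4)
    stcox cpi sex age, nolog
    estat phtest, detail

    * Restore the original declaration used in the rest of the analysis
    stset surv_yrs, failure(dead) id(id) exit(time 8)
    ```

    Each window has fewer failures than the full follow-up, so a non-significant test within a window may partly reflect lower power rather than truly proportional hazards.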

    I could easily stop the analysis at 4 years (that is a long enough interval for a baseline predictor), but I can't find a way to adjust for the earlier period. I tried building interactions with time at different points (_t<1, _t<2, etc.), as suggested in "An Introduction to Survival Analysis Using Stata" (possibly outdated), but could not resolve the violation. I also looked at alternatives to Cox regression in "Flexible Parametric Survival Analysis Using Stata: Beyond the Cox Model". Although this shouldn't be an uncommon problem, I could not find a clear explanation of the best way to address it. I suspect that splines could be used but, again, I can't find a good tutorial for this case. Maybe the solution is so easy that I can't see it?
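
    To make the time-interaction idea concrete, here is a minimal sketch of one possible setup: split follow-up into periods and let CPI have a separate hazard ratio in each. It is not a recommended solution, just an illustration. It assumes the variables from the example above are in memory; the cut points at 2 and 4 years and the variable name period are illustrative choices, not taken from either book.

    ```stata
    * Sketch: let the effect of cpi differ across follow-up periods.
    * Assumes surv_yrs, dead, id, cpi, sex, age are in memory.
    * Cut points (2 and 4 years) and the name "period" are illustrative only.
    stset surv_yrs, failure(dead) id(id) exit(time 8)

    * Split each subject's record at 2 and 4 years; period takes values 0, 2, 4
    stsplit period, at(2 4)

    * One hazard ratio for cpi in each period (0-2, 2-4, 4-8 years)
    stcox c.cpi#i.period sex age, nolog

    * Wald test that the period-specific cpi effects are equal
    testparm c.cpi#i.period, equal

    * Undo the split to return to one record per subject
    drop period
    stjoin
    ```

    A step function like this treats the CPI effect as constant within each period, so the result depends on where the cut points are placed.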

    Any help would be appreciated.
    Last edited by Piersante Sestini; 09 Nov 2023, 03:22.