bloated estimates and CIs in regression of relative survival

Dino Gibertoni

Join Date: Mar 2014
Posts: 29

bloated estimates and CIs in regression of relative survival

24 May 2019, 08:26

hi all
I am using Paul Lambert's stpm2 to regress relative survival on a set of covariates.
Before running stpm2, I looked for the model with the better fit, which turned out to be the PO1, equally considering AIC and BIC

Scale = hazards
df = 1, AIC = 1718.84 BIC = 1732.86
df = 2, AIC = 1715.60 BIC = 1733.13
df = 3, AIC = 1716.78 BIC = 1737.81
df = 4, AIC = 1715.98 BIC = 1740.52
df = 5, AIC = 1717.58 BIC = 1745.62

Scale = odds
df = 1, AIC = 1716.48 BIC = 1730.50
df = 2, AIC = 1715.82 BIC = 1733.34
df = 3, AIC = 1716.54 BIC = 1737.57
df = 4, AIC = 1716.06 BIC = 1740.59
df = 5, AIC = 1717.70 BIC = 1745.74

Scale = normal
df = 1, AIC = 1783.05 BIC = 1797.07
df = 2, AIC = 1784.63 BIC = 1802.16
df = 3, AIC = 1785.71 BIC = 1806.74
df = 4, AIC = 1785.55 BIC = 1810.09
df = 5, AIC = 1787.16 BIC = 1815.21

As a result of the multiple regression, I get extremely high values for exp(b) and for CIs of the variable comCV, which is a 0/1 indicator of the presence of cardiovascular comorbidities:

Code:

. stpm2 females age ib2.BMI_class1 CKD_EPIv2 nph1 nph2 nph3 nph7 com_CV diabete2 hb2 K2 slope_GFR dur_
> dial, ///
>         bhazard(rate) df(1) scale(odds) nolog eform baselevels

Log likelihood = -742.92797                     Number of obs     =        835

------------------------------------------------------------------------------
             |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb           |
     females |   .8044153   .2003005    -0.87   0.382     .4937759    1.310481
         age |    1.03572   .0268045     1.36   0.175     .9844942    1.089611
             |
  BMI_class1 |
     <24.99  |   1.461836   .5295387     1.05   0.295     .7187164    2.973307
   25-29.99  |   .8533482   .2756105    -0.49   0.623     .4531157    1.607102
   30-34.99  |          1  (base)
        ≥35  |   1.492883   .5579428     1.07   0.284     .7176324     3.10563
             |
   CKD_EPIv2 |    1.04822   .0165166     2.99   0.003     1.016343    1.081097
        nph1 |   3.097226    1.63313     2.14   0.032     1.101907    8.705649
        nph2 |   1.681237   1.414288     0.62   0.537     .3232824    8.743305
        nph3 |   1.612484   .8053305     0.96   0.339     .6058649    4.291558
        nph7 |   3.399568   1.993952     2.09   0.037     1.076881    10.73198
      com_CV |   59.39088   146.0529     1.66   0.097     .4791397    7361.688
    diabete2 |   2.201047   .6498246     2.67   0.008     1.234029    3.925845
         hb2 |   .8611074   .0779225    -1.65   0.098     .7211593    1.028214
          K2 |   .8274406   .1500999    -1.04   0.296     .5798647     1.18072
   slope_GFR |   .6025818   .0285791   -10.68   0.000     .5490924    .6612817
    dur_dial |   1.000125   .0002395     0.52   0.601      .999656    1.000595
       _rcs1 |   7.125679   .9170236    15.26   0.000     5.537104    9.170009
       _cons |   .0001439   .0005008    -2.54   0.011     1.56e-07    .1322831
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.
.

I tried running the multiple regression switching to PH1 and PH2 models and still get very high coefficients. Instead if I use the normal scale, results are much more acceptable

Code:

. stpm2 females age ib2.BMI_class1 CKD_EPIv2 nph1 nph2 nph3 nph7 com_CV diabete2 hb2 K2 slope_GFR dur_
> dial, ///
>         bhazard(rate) df(1) scale(normal) nolog eform baselevels

Log likelihood = -810.10208                     Number of obs     =        835

------------------------------------------------------------------------------
             |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
xb           |
     females |   .9484795   .1314391    -0.38   0.703     .7228854    1.244476
         age |   1.000844   .0145023     0.06   0.954     .9728194    1.029675
             |
  BMI_class1 |
     <24.99  |   1.129578   .2418444     0.57   0.569     .7424597    1.718541
   25-29.99  |   .9432495   .1671652    -0.33   0.742     .6664604    1.334992
   30-34.99  |          1  (base)
        ≥35  |   1.072004   .2201216     0.34   0.735     .7168262    1.603169
             |
   CKD_EPIv2 |   1.023801   .0087958     2.74   0.006     1.006706    1.041186
        nph1 |   2.137752   .6897906     2.35   0.019     1.135793    4.023605
        nph2 |   1.694423   .7632828     1.17   0.242     .7007825     4.09695
        nph3 |   1.514186    .467945     1.34   0.179     .8262725    2.774821
        nph7 |   2.124199   .7612974     2.10   0.036     1.052279    4.288048
      com_CV |     4.6894   3.607231     2.01   0.045     1.038358    21.17813
    diabete2 |   1.327321   .2200226     1.71   0.088     .9591289    1.836855
         hb2 |   .9517406   .0484481    -0.97   0.331     .8613673    1.051596
          K2 |   .8995876    .092316    -1.03   0.302     .7356867    1.100004
   slope_GFR |    .792338   .0213394    -8.64   0.000     .7515983    .8352859
    dur_dial |   1.000105   .0001313     0.80   0.423     .9998479    1.000363
       _rcs1 |   2.742701   .2441501    11.33   0.000     2.303595    3.265508
       _cons |    .027567   .0484567    -2.04   0.041     .0008794       .8642
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.

com_CV is certainly highly associated with relative survival, but PO and PH estimates are clearly unrealistic. Maybe the problem lies in the unbalanced distribution of comorbidities and deaths, as you see below. Given these results, I am inclined to use the regression in the normal scale, although it is far by being the fittest. Can anyone recommend a better remedy?

Code:

. tab died com_CV, row chi


           |     (max) com_CV
      Died |         0          1 |     Total
-----------+----------------------+----------
         0 |       207        383 |       590 
           |     35.08      64.92 |    100.00 
-----------+----------------------+----------
         1 |        17        229 |       246 
           |      6.91      93.09 |    100.00 
-----------+----------------------+----------
     Total |       224        612 |       836 
           |     26.79      73.21 |    100.00

thank you very much for your attention

Tags: None

Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

28 May 2019, 15:02

You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stat output, and sample data using dataex.

With a user-written procedure, getting response depends on someone active on the list using the procedure. You may need to contact the author.
Comment

Announcement

bloated estimates and CIs in regression of relative survival

Comment