Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • bloated estimates and CIs in regression of relative survival

    hi all
    I am using Paul Lambert's stpm2 to regress relative survival on a set of covariates.
    Before running stpm2, I looked for the model with the better fit, which turned out to be the PO1, equally considering AIC and BIC

    Scale = hazards
    df = 1, AIC = 1718.84 BIC = 1732.86
    df = 2, AIC = 1715.60 BIC = 1733.13
    df = 3, AIC = 1716.78 BIC = 1737.81
    df = 4, AIC = 1715.98 BIC = 1740.52
    df = 5, AIC = 1717.58 BIC = 1745.62

    Scale = odds
    df = 1, AIC = 1716.48 BIC = 1730.50
    df = 2, AIC = 1715.82 BIC = 1733.34
    df = 3, AIC = 1716.54 BIC = 1737.57
    df = 4, AIC = 1716.06 BIC = 1740.59
    df = 5, AIC = 1717.70 BIC = 1745.74

    Scale = normal
    df = 1, AIC = 1783.05 BIC = 1797.07
    df = 2, AIC = 1784.63 BIC = 1802.16
    df = 3, AIC = 1785.71 BIC = 1806.74
    df = 4, AIC = 1785.55 BIC = 1810.09
    df = 5, AIC = 1787.16 BIC = 1815.21



    As a result of the multiple regression, I get extremely high values for exp(b) and for CIs of the variable comCV, which is a 0/1 indicator of the presence of cardiovascular comorbidities:


    Code:
    . stpm2 females age ib2.BMI_class1 CKD_EPIv2 nph1 nph2 nph3 nph7 com_CV diabete2 hb2 K2 slope_GFR dur_
    > dial, ///
    >         bhazard(rate) df(1) scale(odds) nolog eform baselevels
    
    Log likelihood = -742.92797                     Number of obs     =        835
    
    ------------------------------------------------------------------------------
                 |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    xb           |
         females |   .8044153   .2003005    -0.87   0.382     .4937759    1.310481
             age |    1.03572   .0268045     1.36   0.175     .9844942    1.089611
                 |
      BMI_class1 |
         <24.99  |   1.461836   .5295387     1.05   0.295     .7187164    2.973307
       25-29.99  |   .8533482   .2756105    -0.49   0.623     .4531157    1.607102
       30-34.99  |          1  (base)
            ≥35  |   1.492883   .5579428     1.07   0.284     .7176324     3.10563
                 |
       CKD_EPIv2 |    1.04822   .0165166     2.99   0.003     1.016343    1.081097
            nph1 |   3.097226    1.63313     2.14   0.032     1.101907    8.705649
            nph2 |   1.681237   1.414288     0.62   0.537     .3232824    8.743305
            nph3 |   1.612484   .8053305     0.96   0.339     .6058649    4.291558
            nph7 |   3.399568   1.993952     2.09   0.037     1.076881    10.73198
          com_CV |   59.39088   146.0529     1.66   0.097     .4791397    7361.688
        diabete2 |   2.201047   .6498246     2.67   0.008     1.234029    3.925845
             hb2 |   .8611074   .0779225    -1.65   0.098     .7211593    1.028214
              K2 |   .8274406   .1500999    -1.04   0.296     .5798647     1.18072
       slope_GFR |   .6025818   .0285791   -10.68   0.000     .5490924    .6612817
        dur_dial |   1.000125   .0002395     0.52   0.601      .999656    1.000595
           _rcs1 |   7.125679   .9170236    15.26   0.000     5.537104    9.170009
           _cons |   .0001439   .0005008    -2.54   0.011     1.56e-07    .1322831
    ------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation.
    .

    I tried running the multiple regression switching to PH1 and PH2 models and still get very high coefficients. Instead if I use the normal scale, results are much more acceptable

    Code:
    . stpm2 females age ib2.BMI_class1 CKD_EPIv2 nph1 nph2 nph3 nph7 com_CV diabete2 hb2 K2 slope_GFR dur_
    > dial, ///
    >         bhazard(rate) df(1) scale(normal) nolog eform baselevels
    
    Log likelihood = -810.10208                     Number of obs     =        835
    
    ------------------------------------------------------------------------------
                 |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    xb           |
         females |   .9484795   .1314391    -0.38   0.703     .7228854    1.244476
             age |   1.000844   .0145023     0.06   0.954     .9728194    1.029675
                 |
      BMI_class1 |
         <24.99  |   1.129578   .2418444     0.57   0.569     .7424597    1.718541
       25-29.99  |   .9432495   .1671652    -0.33   0.742     .6664604    1.334992
       30-34.99  |          1  (base)
            ≥35  |   1.072004   .2201216     0.34   0.735     .7168262    1.603169
                 |
       CKD_EPIv2 |   1.023801   .0087958     2.74   0.006     1.006706    1.041186
            nph1 |   2.137752   .6897906     2.35   0.019     1.135793    4.023605
            nph2 |   1.694423   .7632828     1.17   0.242     .7007825     4.09695
            nph3 |   1.514186    .467945     1.34   0.179     .8262725    2.774821
            nph7 |   2.124199   .7612974     2.10   0.036     1.052279    4.288048
          com_CV |     4.6894   3.607231     2.01   0.045     1.038358    21.17813
        diabete2 |   1.327321   .2200226     1.71   0.088     .9591289    1.836855
             hb2 |   .9517406   .0484481    -0.97   0.331     .8613673    1.051596
              K2 |   .8995876    .092316    -1.03   0.302     .7356867    1.100004
       slope_GFR |    .792338   .0213394    -8.64   0.000     .7515983    .8352859
        dur_dial |   1.000105   .0001313     0.80   0.423     .9998479    1.000363
           _rcs1 |   2.742701   .2441501    11.33   0.000     2.303595    3.265508
           _cons |    .027567   .0484567    -2.04   0.041     .0008794       .8642
    ------------------------------------------------------------------------------
    Note: Estimates are transformed only in the first equation.

    com_CV is certainly highly associated with relative survival, but PO and PH estimates are clearly unrealistic. Maybe the problem lies in the unbalanced distribution of comorbidities and deaths, as you see below. Given these results, I am inclined to use the regression in the normal scale, although it is far by being the fittest. Can anyone recommend a better remedy?

    Code:
    . tab died com_CV, row chi
    
    
               |     (max) com_CV
          Died |         0          1 |     Total
    -----------+----------------------+----------
             0 |       207        383 |       590 
               |     35.08      64.92 |    100.00 
    -----------+----------------------+----------
             1 |        17        229 |       246 
               |      6.91      93.09 |    100.00 
    -----------+----------------------+----------
         Total |       224        612 |       836 
               |     26.79      73.21 |    100.00

    thank you very much for your attention

  • #2
    You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stat output, and sample data using dataex.

    With a user-written procedure, getting response depends on someone active on the list using the procedure. You may need to contact the author.

    Comment

    Working...
    X