I'm having problems working out how to internally validate a Cox survival model and would be grateful if someone could let me know how to do it.
One of the previous posts suggested the following for a single explanatory variable (y, say): after running stcox y, run stcox y, vce(bootstrap, reps(500) seed(123)). When I did this, I got exactly the same C-statistic and the same hazard ratio in the non-bootstrap and the bootstrap results, although estat bootstrap, run after the bootstrap model, also gave a bias estimate. Isn't it necessary to evaluate the optimism? If so, how is this done?
Below are the (edited) results that I got:
. stcox BNP3200
Failure _d: dead==1
Analysis time _t: survivaltime
Cox regression with Breslow method for ties
No. of subjects = 2,042 Number of obs = 2,042
No. of failures = 1,019
Time at risk = 83,941.6503
LR chi2(1) = 90.38
Log likelihood = -6936.5657 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
          _t | Haz. ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     BNP3200 |    1.82114    .115614     9.44   0.000     1.608071     2.06244
. estat concordance
Harrell's C = (E + T/2) / P = 0.6014
Somers' D = 0.2028
. stcox BNP3200, vce(bootstrap, reps(500) seed (123))
(running stcox on estimation sample)
Cox regression with Breslow method for ties
Bootstrap results
No. of subjects = 2,042 Number of obs = 2,042
No. of failures = 1,019
Time at risk = 83,941.6503
Wald chi2(1) = 88.31
Log likelihood = -6936.5657 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
          _t | haz. ratio   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     BNP3200 |    1.82114   .1161697     9.40   0.000      1.60711    2.063674
------------------------------------------------------------------------------
Note that the coefficient is ln(1.82114) = 0.599462678
. estat bootstrap
Bootstrap results Number of obs = 2,042
Replications = 500
------------------------------------------------------------------------------
             |    Observed               Bootstrap
          _t | coefficient       Bias    std. err.    [95% conf. interval]
-------------+----------------------------------------------------------------
     BNP3200 |   .59946272   .0016671   .06378955    .4845526    .723986  (BC)
------------------------------------------------------------------------------
Key: BC: Bias-corrected
. estat concordance
Harrell's C = (E + T/2) / P = 0.6014
Somers' D = 0.2028
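For concreteness, the kind of optimism correction being asked about could be sketched roughly as below. This is only a sketch of a Harrell-style bootstrap, not something taken from the earlier post. It assumes the data are already stset with one record per subject and no id() variable. The names c_app, c_boot, c_test, opt_sum, b_boot and xb_b are made up for illustration, and 200 replications is an arbitrary choice.

```stata
* Sketch only: optimism-corrected Harrell's C via the bootstrap.
* Assumes stset data, one record per subject (as in the output above).
set seed 123
local reps 200

* 1. Apparent C: model fitted and evaluated on the full sample
quietly stcox BNP3200
quietly estat concordance
scalar c_app = r(C)

* 2. Optimism: average of (C in bootstrap sample) minus
*    (C of the same bootstrap model evaluated in the original data)
scalar opt_sum = 0
forvalues b = 1/`reps' {
    preserve
    bsample
    quietly stcox BNP3200
    quietly estat concordance
    scalar c_boot = r(C)
    * keep the bootstrap coefficients before the original data come back
    matrix b_boot = e(b)
    restore

    * linear predictor in the ORIGINAL data from the bootstrap coefficients
    capture drop xb_b
    matrix score double xb_b = b_boot

    * Refit on the fixed linear predictor to obtain its concordance.
    * This matches the ranking by xb_b only if the refitted coefficient
    * is positive, which should be checked.
    quietly stcox xb_b
    quietly estat concordance
    scalar c_test = r(C)

    scalar opt_sum = opt_sum + (c_boot - c_test)
}

* 3. Optimism-corrected C
scalar optimism = opt_sum / `reps'
display "Apparent C  = " %6.4f c_app
display "Optimism    = " %6.4f optimism
display "Corrected C = " %6.4f c_app - optimism
capture drop xb_b
```

Two points may explain the identical results above. First, vce(bootstrap) only replaces the standard errors with bootstrap ones; the reported hazard ratio is still the one observed in the full sample, so estat concordance afterwards returns the same C. Second, Harrell's C depends only on how the model ranks subjects, so with a single covariate the optimism in C should be close to zero unless the sign of the coefficient flips in a replicate. A correction like this is likely to matter more for multivariable models, especially ones involving variable selection, in which case the selection step would need to be repeated inside the loop.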
Many thanks