Issue with specifying boundary and internal knots using the stpm2cr command

Matt Warkentin

Join Date: May 2016

Posts: 104
#1

Issue with specifying boundary and internal knots using the stpm2cr command

19 Dec 2017, 10:25

Hi all,

I am trying to use the stpm2cr package in order to model the hazard of lung cancer with the competing risk of all-cause mortality. I am trying to estimate a simple single covariate (x1) example whereby I specify the boundary knots and the internals knots for the baseline hazard (using restricted cubic splines). However, when I use the bknots(#) and knots(#) suboptions, Stata kicks back the error that I can only use DF or KNOTS.

When I exclude the bknots and knots suboptions, and only specify the DF for the splines, it estimates and converges with no issues. However, I have meaningful (possibly) knots that I want to place, and I can't seem to get it to run. I have been unable to troubleshoot this error, but it may be a simple coding issue.

Here is a simplified example of the code that runs properly:

stpm2cr [ lung: x1 , scale(hazard) df(4) ] [ death: x1 , scale(hazard) df(4) ] , events(failure) cause( 1 2 ) cens(0) eform

Here is a simplified example of the code that will not execute:

stpm2cr [ lung: x1 , scale(hazard) bknots(50 70) knots(60 65) ] [ death: x1 , scale(hazard) bknots(50 70) knots(60 65) ] , events(failure) cause( 1 2 ) cens(0) eform

Thank you in advance for your help.

Matt

Last edited by Matt Warkentin; 19 Dec 2017, 10:29.
Tags: None
Sarwar Islam

Join Date: Jul 2015

Posts: 6
#2

20 Dec 2017, 03:21

Hi Matt,

Apologies, looks like this was a bug which I have now fixed so thank you for pointing it out!

Just also wanted to let you know that in stpm2cr - when specifying your own knots, it must be given on the log-time scale. I will make this clearer in the help file too... following the example given in the help file -

Code:

use http://www.stata-journal.com/software/sj4-2/st0059/prostatecancer stset time, failure(status==1, 2, 3) scale(12) id(id) noshow tab treatment, gen(trt) gen lnt = ln(_t) summ lnt stpm2cr [prostate: trt2, scale(hazard) bknots(-4.787492 1.845827) knots(-2 1)] /// [CVD: trt2, scale(hazard) bknots(-4.787492 1.845827) knots(-2 1)] /// [other: trt2, scale(hazard) bknots(-4.787492 1.845827) knots(-2 1)] /// , events(status) cause(1 2 3) cens(0) eform

I've attached the amended ado file which you can replace the old one with. I will update on SSC too.

Hope this solves the issue - let me know if any other problems arise.

Sarwar
Attached Files

stpm2cr.ado (40.8 KB, 1 view)
Comment
Matt Warkentin

Join Date: May 2016

Posts: 104
#3

20 Dec 2017, 13:08

Thank you for the response, Sarwar. Also, thank you for fixing this bug so quickly. Very much appreciated.

So just to clarify, if my natural time scale is age, and I want my lower boundary knot to be placed at 50 years, for example, I would define this numerically as ln(50) = 3.91. Is this correct?

While I have your audience -- I also wanted to clarify whether the baseline hazard is being modelled as piecewise cubic splines or RESTRICTED cubic splines? For example, when I specify my boundary knots, is the hazard beyond these knots a cubic function or restricted to a linear function?

Last edited by Matt Warkentin; 20 Dec 2017, 13:17.
Comment
Matt Warkentin

Join Date: May 2016

Posts: 104
#4

20 Dec 2017, 14:16

Hi Sarwar,

One follow-up issue I am now running into -- whenever I try to include more than a single covariate in the models, the model will not execute and returns error r(301) "last estimates not found".

This issue does not occur with the stpm2 package. Why would this occur in the competing risk model?

Example code:

stpm2cr [prostate: trt2 age , scale(hazard) df(4)] [CVD: trt2 age , scale(hazard) df(4)] [Other: trt2 age , scale(hazard) df(4)] , events(status) cause(1 2 3) cens(0) eform

Thanks for your help.

Matt

Last edited by Matt Warkentin; 20 Dec 2017, 14:21.
Comment
Sarwar Islam

Join Date: Jul 2015

Posts: 6
#5

03 Jan 2018, 04:56

Hi Matt,

Apologies for the delayed response, I have only just seen this now.

So just to clarify, if my natural time scale is age, and I want my lower boundary knot to be placed at 50 years, for example, I would define this numerically as ln(50) = 3.91. Is this correct?

Yes, this is correct. I will soon add a -knscale- option similar to stpm2 so that this can be more flexible.

While I have your audience -- I also wanted to clarify whether the baseline hazard is being modelled as piecewise cubic splines or RESTRICTED cubic splines? For example, when I specify my boundary knots, is the hazard beyond these knots a cubic function or restricted to a linear function?

Baseline hazard is being modelled as restricted cubic splines. So as you say, before the first boundary knot and after the last boundary knot the function is forced to be linear.

One follow-up issue I am now running into -- whenever I try to include more than a single covariate in the models, the model will not execute and returns error r(301) "last estimates not found".

This issue does not occur with the stpm2 package. Why would this occur in the competing risk model?

Looks like this is a bug in the mata file which I will correct and get back to you on. In the meantime, if you use the option -old- which just fits the model using a marginally slower method, you should be able to fit the model.

I will update you once I have corrected the issue!

Sarwar
1 like
Comment
Sarwar Islam

Join Date: Jul 2015

Posts: 6
#6

03 Jan 2018, 07:57

I will update you once I have corrected the issue!

Corrected - a version is attached and an update will be uploaded onto SSC.

Any other issues let me know!

Sarwar
Attached Files

stpm2cr.ado (40.9 KB, 1 view)
1 like
Comment
Matt Warkentin

Join Date: May 2016

Posts: 104
#7

03 Jan 2018, 15:32

Hi Sarwar,

Thank you again for your prompt updating. I am now able to include multiple covariates in the model. I do have a few follow-up questions...

1. Can a concordance statistic (e.g. Harrell's C-statistic) be calculated after fitting the stpm2cr model? For example using Cox PH regression:

Code:

stcox variable estat concordance

2. As mentioned in previous comments, I am trying to use age as my natural time scale (as I have arbitrary but "meaningful" ages where I want to place knots). So for example, I have stset my data as follows:

Code:

stset event_age , failure(failure==1 , 2) enter(age) exit(event_age) id(id)

This code will lead to estimating a delayed entry model when using stpm2cr. However, now it is not clear to me what the knot values are if I want to place my own boundary and internal knots. My arbitrary age values were 50 and 70 years for boundary knots, and 60 and 65 years for internal knots. I don't believe taking the natural log of these values is the correct use, as the way I have stset the data informs Stata that the time at risk is the difference between their baseline age and the age the event occurred. It may be that I am stset'ing my data wrong, and a delayed entry model is incorrect.

So to summarize and concisely present my question, if I want to use age as the timescale, how can I determine the knot value inputs so they correspond to my selected ages?

3. Also, sometimes when including multiple predictors in the model, I get the following error:

Code:

[initial values infeasible, retrying with -initweight(10)- option] command stpm2crinit is unrecognized r(199);

Last edited by Matt Warkentin; 03 Jan 2018, 16:23.
Comment
Matt Warkentin

Join Date: May 2016

Posts: 104
#8

12 Jun 2018, 08:24

Hi Sarwar,

Thanks for the response and congratulations on submitting your PhD. I appreciate your detailed response and look forward to an official update.

Also, thank you for sharing these files. I will try to get around to testing them when I have the chance.
Comment
Sarwar Islam

Join Date: Jul 2015

Posts: 6
#9

12 Jun 2018, 09:13

Hi Matt,

Apologies, this has been a VERY late reply as I had been in the middle of submitting my PhD!

I have come back to this however, and the bug for including multiple covariates has been fixed.

1. This can be done, but isn't currently available post-estimation. However, since all the model parameters are stored, the user can calculate this if they wish to. Hopefully in the future I can extend the command for making these calculations post-estimation.

2. I think there might be an error in what you stset - for your enter variable, should it not be the date at which the patient enters the study e.g. date of diagnosis? When using age as the timescale, it should be a delayed entry model.

3. This is a mistake in calling the stpm2crinit command which does not exist and should instead be calling stpm2cr again. I have fixed this too. Just a note on why it is re-running the command with the initweight option - an issue with models on the cumulative incidence scale is that there is no natural constraint for the total CIF to equal 1. This usually occurs when mortality for a group of patients is particularly high. This leads to fewer events towards the end of follow-up time which may lead to unstable initial values. Therefore, to force the model to provide better initial values, weights are applied towards the end of follow-up time.

In any case, in the updated version (attached) now calls stpm2cr correctly so you should no longer get this error (another issue might be problems with convergence).

(Note that there are additional ado files with additional features which will be communicated in an official update).

Again, apologies that I couldn't respond sooner!

Sarwar

Attached Files

stpm2cr_pred_csh.ado (1.7 KB, 2 views)

stpm2cr_matacode.mata (9.7 KB, 2 views)

stpm2cifgq.ado (33.5 KB, 2 views)

stpm2cr_pred.ado (55.4 KB, 2 views)

stpm2cr.ado (41.7 KB, 1 view)
1 like
Comment
Pedro Ochoa allemant

Join Date: Apr 2022

Posts: 6
#10

01 Aug 2024, 21:43

Hi all,

I have been trying to use stpm2cr in STATA v18 but I get an error when trying to use a factor variable (unos_e which has 3 levels). Below my code:

Code:

stset waitlist_month, failure(lstatus==1, 2) exit(time 60) id(pt_code) stpm2cr [Death: i.unos_e gender_list, scale(hazard) df(4)] [Txp: i.unos_e gender_list, scale(hazard) df(4)], events(lstatus) cause(1 2) cens(0) eform

However, when I removed the factor-variable notation, it runs without issues. Is there a way to go around this?

Thank you,
Pedro
Comment

Announcement

Issue with specifying boundary and internal knots using the stpm2cr command

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment