  • Advice regarding estimating competing risk

    If I want to calculate the cumulative incidence of both the outcome (fracture = 1 / no fracture = 0) and death (dead = 1 / alive = 0) in a cohort, using competing-risks methods, does the "status" variable I use have to combine outcome, death, and censoring (outcome = 1, dead = 2, no outcome/no death = 3)? I have been trying with the commands below, but they won't seem to work:
    Code:
    gen outcome = had_fracture
    gen endpoint = min(first_fracture_date, end_follow_date)
    stset endpoint, failure(outcome = 1) scale(365.25) origin(start_follow_date)
    stcompet= , compet(dead=1)
    It is the line "stcompet= , compet(dead=1)" that makes me suspect I need to combine outcome and dead into one variable for this to be correct.

    And then to depict the (competing) cumulated incidence of fractures and death:
    Code:
    twoway ///
    (line cummort _t if status==1 , c(J)) ///
    (line cummort _t if dead==1 , c(J))
    I cannot use dataex because I am working with sensitive data, sorry. Please tell me if more information is needed for better comprehension. Thank you.
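    For what it's worth, a minimal sketch of the combined-status setup the post suspects is needed. This is untested and the names status and cuminc are assumptions; the stcompet call follows the usual stcompet (SSC) syntax, where compet1() names the code for the competing event:
    Code:
    * Sketch only: combine the two endpoints into one status variable,
    * then tell stcompet which code marks the competing event (death).
    gen status = 3                                   // 3 = censored: no fracture, alive
    replace status = 1 if outcome == 1               // 1 = fracture (event of interest)
    replace status = 2 if dead == 1 & outcome != 1   // 2 = death before any fracture
    
    stset endpoint, failure(status == 1) scale(365.25) origin(start_follow_date)
    stcompet cuminc = ci, compet1(2)                 // cumulative incidence, death competing
    
    * Plot the cumulative incidence of each event type:
    twoway (line cuminc _t if status == 1, c(J) sort) ///
           (line cuminc _t if status == 2, c(J) sort)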

  • Clyde Schechter
    replied
    Well, my bias is to almost always do more analyses and present them all.

    Unfortunately, journals face costs associated with increasing the length of articles, so they impose limits on length. This inevitably leads authors to leave out important information, because the length limits are usually unreasonably severe and leave inadequate room for good scientific presentation. The good news is that most journals now allow authors to write a supplement to their paper, which they will then host on their website. So you can present what you think are the most important analyses in the paper itself, and then refer the reader to additional analyses in the supplement. For my part, I take advantage of this: in my recent publications my supplements are typically substantially longer than the articles themselves, but provide a much fuller picture of what was done and what the results imply.



  • Jonas Kristensen
    replied
    Thank you for the advice!
    In the literature, researchers generally do not describe how they handle (censor or exclude) patients who acquire a second stroke in the follow-up period, even though they only investigate patients with first-ever stroke, so it is hard to make a decision based on prior research.

    As mentioned, I am leaning towards simply censoring patients at a second stroke, in the analyses of stroke severity and civil status as well as in the incidence rate of fractures. The information on stroke severity and civil status is collected when the patient is admitted with the stroke.

    My last question in this regard is: do you think it would make sense to do a separate analysis for stroke severity where patients with a second stroke are censored, but not to do this for the other analyses? It just seems like overkill, when other articles hardly describe what they do with patients with multiple strokes in terms of exclusion and censoring.



  • Clyde Schechter
    replied
    I think your reasoning makes sense. Following a second stroke, the severity of the first stroke is probably no longer very predictive about the risk of a fracture--the severity of the later stroke becomes more salient. So censoring at the second stroke makes sense to me.

    For civil status, as long as the civil status remains unchanged, it seems reasonable to leave them uncensored at second stroke. You would, however, want to censor them at the time of any change in civil status.

    Another approach to this is to use multiple records per patient and use time-varying covariates in your Cox model. This, too, however, entails the assumption that the effect of stroke severity on fracture risk after a second stroke is the same as the effect of the same level of stroke severity after the first stroke. I don't have enough intuition about this to say if that assumption is credible or not. I can think of arguments why the same level of stroke severity might have a different effect after the second stroke, but I can also think of arguments why it might be the same. If you decide to explore this approach, you might want to discuss that with some experts in physical medicine & rehabilitation or geriatrics.
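    A hedged sketch of what that multiple-record, time-varying-covariate setup might look like; the variable names (episode_end_date, current_stroke_severity) are assumptions, not from this thread:
    Code:
    * Hypothetical sketch: one record per patient per stroke episode, with the
    * severity of the most recent stroke carried as a time-varying covariate.
    * With multiple records per subject, -stset- needs the id() option.
    stset episode_end_date, id(person_id) failure(outcome == 1) ///
        origin(start_follow_date) scale(365.25)
    stcox i.current_stroke_severity i.civil_status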



  • Jonas Kristensen
    replied
    Thank you, Mr. Schechter.
    I have had some discussion regarding when, and if, it makes sense to censor patients when they have a second stroke.
    I am estimating the incidence rate of fractures after stroke, and then I am using Cox regression to analyse the risk factors stroke severity and civil status.

    The initial reasoning is that a patient's risk of fracture would change drastically on having a second stroke, which would have an unwanted effect on the estimates. But some of the observation time would also be lost if these patients are censored (approximately 40,000 out of 440,000 person-years).
    It is also an option to censor patients after a second stroke only in the Cox regression analysis of stroke severity, as a patient's stroke severity most likely isn't the same after the second stroke, and then not to censor after a second stroke in the analysis of civil status and the estimation of the incidence rate.

    I am personally leaning towards censoring at the second stroke, period, sacrificing the 40,000 person-years of risk time but then definitively having an article analysing patients from first-ever stroke to fracture.



  • Clyde Schechter
    replied
    I do agree. You should be seeing fewer failures and less person time at risk when you terminate observation at the second stroke. However, the impact on the incidence rate cannot be predicted: logically, it could increase, decrease, or stay the same.



  • Jonas Kristensen
    replied
    I seem to have figured out what the issue was. When using the code as written in post #33, the "second_stroke_date" variable was generated as the first stroke date for each patient, so every patient got a date in "second_stroke_date". But as I have written before, I am merging several data sets, so I tried generating the "second_stroke_date" variable before merging, which resulted in "second_stroke_date" being missing for everyone who had not had a second stroke, and containing the second stroke date for those who had one, which seems correct.

    I then get only 900 observations flagged as "end on or before enter" by the stset command, where I got 115,145 before.

    The results of calculating the incidence rate of fractures have then given rise to some new questions, because the risk time has gone down while the number of failures and the IR have gone up.

    Old results (follow-up not censored at second stroke): person-time = 441,635; failures = 16,268; incidence rate = 36.84
    New results (risk time runs only until second stroke): person-time = 409,892; failures = 16,555; incidence rate = 40.39

    It seems logical to me that the risk time would go down, because some of the patients who previously had time until fracture now have time until second stroke, which came before their initial fracture. But the fact that the failures and the IR increase does not seem logical to me. Do you agree?
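    As a quick sanity check on the reported figures, the rates are just failures divided by person-time (per 1,000 person-years), so the two quantities move together mechanically:
    Code:
    display 16268/441635*1000    // = 36.84, matching the old incidence rate
    display 16555/409892*1000    // = 40.39, matching the new incidence rate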



  • Clyde Schechter
    replied
    Well, the code looks correct. And you have verified that the second stroke dates are correct. But I agree the results are not sensible. In particular, what has changed from what you were doing before is that you now consider people censored as of the date of their second stroke (if they have a second one). Other than that, everything is as it was before. Now, this should result in more censored observations and fewer failures than before. But the additional endpoints derive from second strokes, and by definition, they cannot occur before the first stroke (which is the point at which people enter the analysis).

    Without seeing the data, I don't know how to troubleshoot this. The best I can suggest is
    Code:
    browse if _st != 1 | _t < _t0
    which will show you the observations that are being excluded and perhaps by looking at those you will be able to see where things are going wrong.



  • Jonas Kristensen
    replied
    Thank you!
    I have tried setting up the commands like so:
    Code:
    sort person_id stroke_date
    by person_id (stroke_date): gen second_stroke_date = stroke_date[2]
    by person_id (stroke_date): gen start_follow_date = stroke_date[1]
    by person_id (admission_date), sort: egen first_post_stroke_fx_date = ///
    min(cond(has_fracture_now & admission_date > start_follow_date, admission_date, .))
    by person_id: egen diagnosiscode = min(cond(admissiondate == first_post_stroke_fx_date, diag, .))
    by person_id (admission_date): gen end_follow_date = min(td(31dec2017), dødsdato, second_stroke_date)
    
    gen dead = 1
    replace dead = 0 if missing(death_date)
    by person_id: egen died = max(dead)
    by person_id: keep if _n == 1
    
    gen outcome = !missing(first_post_stroke_fx_date)
    gen endpoint = min(first_post_stroke_fx_date, end_follow_date)
    stset endpoint, failure(outcome = 1) scale(365.25) origin(start_follow_date)
    But it seems like something goes wrong, as 115,145 observations end on or before enter (see attachment).
    I have checked that the second stroke date is generated correctly, which is the case.

    Attached Files



  • Clyde Schechter
    replied
    Yes. You first need to compute the date of the second stroke:

    Code:
    by person_id (stroke_date): gen second_stroke_date = stroke_date[2]
    The variable second_stroke_date will contain a missing value if there is only one stroke.

    Then change your end_follow_date variable code to:
    Code:
    by person_id (admission_date): gen end_follow_date = min(td(31dec2017), dødsdato, second_stroke_date)



  • Jonas Kristensen
    replied
    Great! Thank you

    In regards to the former code, where I have only included patients with first-ever stroke:
    Code:
    sort person_id stroke_date
    by person_id (stroke_date): gen start_follow_date = stroke_date[1]
    by person_id (admission_date), sort: egen first_post_stroke_fx_date = ///
    min(cond(has_fracture_now & admission_date > start_follow_date, admission_date, .))
    by person_id: egen diagnosiscode = min(cond(admissiondate == first_post_stroke_fx_date, diag, .))
    by person_id (admission_date): gen end_follow_date = min(td(31dec2017), dødsdato)
    by person_id: egen had_fracture = max(has_fracture_now)
    gen dead = 1
    replace dead = 0 if missing(death_date)
    by person_id: egen died = max(dead)
    by person_id: keep if _n == 1
    Some of these patients/person_ids have more than one stroke in the follow-up period. Is it possible to set up the commands so that patients contribute time at risk until their second stroke, if they do not have a fracture before that time?

    So patients are followed until fracture or second stroke or death or end of follow up. Hope that makes sense
    Last edited by Jonas Kristensen; 12 Feb 2019, 12:21.



  • Clyde Schechter
    replied
    Yes. Change the -stset- specifications and then re-run -stdescribe-. -stset- includes an -if(exp)- option [not to be confused with adding -if whatever- to an analytic command itself] that will allow you to look at subpopulations. And changing the -failure()- option will allow you to look at specific diagnoses.
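    A hedged sketch of the two restrictions described above; female, fx_hip, and the diagnosis code 810 are made-up names for illustration, not from this thread:
    Code:
    * Restrict the risk set to a subpopulation via -stset-'s if() option:
    stset endpoint, failure(outcome == 1) scale(365.25) ///
        origin(start_follow_date) if(female == 1)
    stdescribe
    
    * Restrict the failure definition to one specific diagnosis by building
    * an indicator first, then using it in failure():
    gen fx_hip = outcome == 1 & diagnosiscode == 810   // 810 is a made-up code
    stset endpoint, failure(fx_hip == 1) scale(365.25) origin(start_follow_date)
    stdescribe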



  • Jonas Kristensen
    replied
    Hello again
    I was hoping that you could help me with estimating patients' mean time at risk until fracture for specific groups.

    I know that i can use:
    Code:
    stdescribe
    to get the mean time at risk until fracture with a 95% CI, but is there a smart way to estimate the mean time at risk for only women, say, or only for a specific fracture diagnosis?



  • Jonas Kristensen
    replied
    Thank you again Mr. Schechter!



  • Clyde Schechter
    replied
    This is a difficult problem to which there is no simple solution. Nor is it feasible to summarize all the situations and approaches in a short post. At https://pdfs.semanticscholar.org/58d...c218e126e4.pdf , Paul Allison gives an overview of some of the commonly used approaches and their pros and cons. Reading that would be a good starting point for how to think about this and what some of your options might be.

