survival KM curve - what to do if missing time variable

Martin Imelda Borg

Join Date: Jan 2022
Posts: 225

24 Jul 2023, 04:14

Hi, I am creating a KM curve for the time until the event (revision) happened, i.e. revision = 1.
As you can see, I have the date when the revision took place (revisiontime).
I generated a time variable (time in years from the initial date of surgery, yearsurgery, to the date of revision).
As you can see, if a revision did not occur (revision = 0), then timeuntilrevisionyr3 == .

Once I start setting up my survival data

Code:

stset timeuntilrevisionyr3, failure(revision==1)

I get 8 missing

How do I address this? Or can I generate a rolling time variable that, where the time is missing, assumes the patient got to the end of the study without being revised?
It is correct to say that if a patient was not recorded as revised, then the patient was not revised.

My solution would be to find the max date of revision surgery, which would mark the end of the study, and insert this where timeuntilrevisionyr3 == .
Is this a good way to go about it?

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(SurgeonID Surgery revision) str1 experience float(yearsurgery practiceyears revisiontime timeuntilrevision timeuntilrevisionyr timeuntilrevisionyr3) byte(_st _d) double _t byte _t0
 1 1 0 "0" 14611         0     .    .         .         . 0 .                 . .
 2 1 1 "0" 14610         0 16074 1464       122 4.0109587 1 1 4.010958671569824 0
 3 1 1 "0" 14611         0 16075 1464       122 4.0109587 1 1 4.010958671569824 0
 4 1 0 "0" 15768         0     .    .         .         . 0 .                 . .
 5 1 1 "0" 16865         0 17596  731  60.91667 2.0027397 1 1 2.002739667892456 0
 7 1 1 "0" 17628         0 17993  365 30.416666         1 1 1                 1 0
 8 1 1 "0" 18271         0 19001  730  60.83333         2 1 1                 2 0
 9 1 1 "0" 16440         0 17536 1096  91.33334   3.00274 1 1 3.002739667892456 0
10 1 0 "0" 18243 3.0931506     .    .         .         . 0 .                 . .
10 1 0 "0" 18243 3.0931506     .    .         .         . 0 .                 . .
10 1 1 "2" 18277 3.0931506 20162 1885 157.08333  5.164383 1 1 5.164383411407471 0
10 1 0 "3" 19372 3.0931506     .    .         .         . 0 .                 . .
12 1 1 "0" 16167  .8438356 16532  365 30.416666         1 1 1                 1 0
12 1 0 "0" 16167  .8438356     .    .         .         . 0 .                 . .
12 1 0 "2" 16444  .8438356     .    .         .         . 0 .                 . .
end
format %td yearsurgery
format %td revisiontime

Last edited by Martin Imelda Borg; 24 Jul 2023, 05:11.

Martin Imelda Borg

Join Date: Jan 2022

Posts: 225
#2

24 Jul 2023, 12:10

Found a solution for this, for anyone who refers to this.

If an event did not occur (e.g. revision = 0), the value of _t should run up to the last date of the study (max date), unless of course the patient died before then.

Therefore you should generate a time variable; in my case this was timeuntilrevisionyr3. However, mine only held a time if an event (revision = 1) occurred. For those where revision = 0, it should instead be based on either the last date of your study or, if the patient died, the date of death. Otherwise, if you leave it empty, your _t will have missing values.
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#3

24 Jul 2023, 12:16

You appear to have multiple records per person here, so I'm not certain you have set up your -stset- command correctly. But I don't understand your data structure well enough. If you do indeed have multiple records per patient, those missings would be a sign of an error.
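
One quick way to check the point about multiple records is to count rows per identifier. This is only a sketch: it uses the SurgeonID and revision columns from the -dataex- listing in #1, the helper variable nrec is made up for illustration, and whether SurgeonID identifies a patient, a surgeon, or something else is not stated in the thread.

```stata
* Rows per identifier, using SurgeonID and revision from the -dataex- in #1
clear
input float(SurgeonID revision)
 1 0
 2 1
 3 1
 4 0
 5 1
 7 1
 8 1
 9 1
10 0
10 0
10 1
10 0
12 1
12 0
12 0
end

* nrec (hypothetical helper) = number of rows sharing the same SurgeonID
bysort SurgeonID: generate nrec = _N

* How many rows belong to an identifier that appears more than once?
tabulate nrec
count if nrec > 1
```

If each row is a separate operation with its own follow-up, single-record -stset- may still be appropriate; if rows are repeated spells for the same subject, -stset- would normally need an id() option.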
Clyde Schechter

Join Date: Apr 2014

Posts: 30122
#4

24 Jul 2023, 12:19

Or can I generate a rolling time variable that, where the time is missing, assumes the patient got to the end of the study without being revised?
It is correct to say that if a patient was not recorded as revised, then the patient was not revised.

My solution would be to find the max date of revision surgery, which would mark the end of the study, and insert this where timeuntilrevisionyr3 == .
Is this a good way to go about it?

You're on the right track, but the details are not correct. If you had a system whereby you could know for sure that the absence of a revision date means that the surgery was never revised, then, yes, you could use the end of study date to calculate their time until revision, and use revision == 0 to mark censored observations. That is, you would replace the missing timeuntilrevision by the time from surgery to the end of study date and in your -stset- command specify -failure(revision == 1)-.
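
As a rough sketch of that first scenario, the code below fills the exit date with an end-of-study date for unrevised cases and then builds the time variable. The rows are taken from the -dataex- in #1, but study_end (31 Dec 2015), exitdate and timeyr are hypothetical names and values chosen for illustration, and dividing by 365.25 is just one convention for converting days to years.

```stata
* Sketch: administrative censoring at a single end-of-study date
clear
input float(revision yearsurgery revisiontime)
0 14611     .
1 14610 16074
0 15768     .
1 16865 17596
end
format %td yearsurgery revisiontime

* Hypothetical end-of-study date; replace with the real one
scalar study_end = mdy(12, 31, 2015)

* Exit date: the revision date if revised, otherwise end of study
generate exitdate = cond(revision == 1, revisiontime, study_end)
format %td exitdate

* Follow-up time in years since surgery (no missing values remain)
generate timeyr = (exitdate - yearsurgery) / 365.25

* revision == 0 rows are now censored rather than dropped
stset timeyr, failure(revision == 1)
```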

But this sounds like human subjects medical data. And in real life we are seldom in a position to really say that the patient never had surgery revision just because we don't happen to know about it. Patients get lost to follow-up for many reasons: they move away, they die of related or unrelated causes, they had the revision at another facility and never came back to yours again, they had a revision at your facility but the records got lost, and on and on. So what you have to do is identify the last date at which you really are certain that the patient was still observed in your study and known not to have had any revision. Then you use the time from surgery to that date as timeuntilrevision, keep revision equal to 0 in these observations, and then specify -failure(revision == 1)- in your -stset- command.
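
A minimal sketch of the last-known-follow-up approach, this time letting -stset- work from the dates directly. The variable lastseen (last date the patient was known to be unrevised) does not exist in the posted data; it and its values are invented here purely for illustration, as is exitdate.

```stata
* Sketch: censor at each patient's last known revision-free date
clear
input float(revision yearsurgery revisiontime lastseen)
0 14611     . 16500
1 14610 16074 16074
0 15768     . 17000
end
format %td yearsurgery revisiontime lastseen

* Exit date: the revision date if revised, otherwise last known follow-up
generate exitdate = cond(revision == 1, revisiontime, lastseen)
format %td exitdate

* Every record needs an exit date after its surgery date
assert !missing(exitdate) & exitdate > yearsurgery

* origin() starts the clock at surgery; scale() converts days to years
stset exitdate, failure(revision == 1) origin(time yearsurgery) scale(365.25)
```

Using origin() and scale() avoids hand-computing the time variable, so a missing revision date can no longer turn into a missing _t.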

This approach is fundamental to the handling of time-to-event data and is gone over in every introductory survival analysis text I have ever seen. If you do not find my explanation of it clear, then consult such a text for a longer, hopefully better, explanation, and probably some worked examples.
Martin Imelda Borg

Join Date: Jan 2022

Posts: 225
#5

24 Jul 2023, 14:07

Very true Clyde Schechter, but in certain situations there is something like a 90% guarantee that a patient has never had further surgery (revision), as reporting to some databases is compulsory. So it would be fair to assume that, for someone who has never had further surgery (revision), follow-up ends at the max date of the study or, if the patient died, at the date of death (time = date of death - date of surgery).

However, that being said, is there a way to display and differentiate on KM graphs those which have had a revision vs those who have died?
Clyde Schechter

Join Date: Apr 2014

Posts: 30122
#6

24 Jul 2023, 14:46

Very true Clyde Schechter but in certain situations, sometimes there is a 90% guarantee that a patient has never had further surgery (revision) as some databases are compulsory.

Well, I think it is a matter of judgment how much faith to put in that guarantee. First, all databases have reporting lags. So you probably don't reach a high level of guarantee until a couple of years have passed. Then there is the possibility that the revision is done someplace outside the jurisdiction of the database. State and local databases are particularly subject to that as people travel fairly freely among states and lower-level jurisdictions for health care. Even national databases can miss things as people might travel specifically to a foreign country to obtain treatment at lower prices than available in the US, or might just coincidentally be abroad when the need for the procedure arises. Admittedly this last one is not very common, but you can't disregard these things altogether.

That said, if you are dealing with a national-level database as your source, and it is one of the databases that has a solid infrastructure that does capture very close to 100% of its target information, and if you factor in the reporting delay that is typical of that database, then I think you would be OK assuming that people did not have the revision if there is no such entry for them in the database up to the last day for which we are sure that reporting is complete. I would then use that date, or their date of death, whichever is earlier, in place of the revision date, again with revision == 0 and -failure(revision == 1)- in -stset-.
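
The "whichever is earlier" rule might be coded roughly as follows. The names deathdate, cutoff, censordate and exitdate are hypothetical (no date of death appears in the posted data), and the cutoff of 31 Dec 2013 is only a placeholder for the last day on which reporting is believed complete. The sketch does not handle revisions recorded after the cutoff, which would need to be censored at the cutoff as well.

```stata
* Sketch: censor at the earlier of death and the reporting cutoff
clear
input float(revision yearsurgery revisiontime deathdate)
0 14611     .     .
1 14610 16074     .
0 15768     . 17000
end
format %td yearsurgery revisiontime deathdate

* Hypothetical last date with complete reporting
scalar cutoff = mdy(12, 31, 2013)

* min() ignores missing arguments, so patients with no recorded death get the cutoff
generate censordate = min(deathdate, cutoff)

* Exit date: the revision date if revised, otherwise the censoring date
generate exitdate = cond(revision == 1, revisiontime, censordate)
format %td censordate exitdate

stset exitdate, failure(revision == 1) origin(time yearsurgery) scale(365.25)
```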

However, that being said, is there a way to display and differentiate on KM graphs those which have had a revision vs those who have died?

If you handle it the way I have recommended, designating these as censored observations, then they do not appear in the KM graphs as revision events. They just change the denominator of cases at risk for revision--but that isn't visible in the graphs.
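
For anyone who does want the censoring and the numbers at risk to show up, -sts graph- has options for both. The sketch below uses made-up follow-up times (timeyr is a hypothetical variable). Note that the tick marks show that an observation was censored, not why, so deaths and end-of-study censoring look the same.

```stata
* Sketch: making censoring and the risk set visible on a KM plot
clear
input float(timeyr revision)
4.01 1
2.00 1
1.00 1
6.50 0
3.20 0
5.16 1
end

stset timeyr, failure(revision == 1)

* Life table: numbers at risk, failures, and net lost at each time
sts list

* KM curve with a number-at-risk table under the plot
sts graph, risktable

* KM curve with a tick mark at each censoring time
sts graph, censored(single)
```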

Last edited by Clyde Schechter; 24 Jul 2023, 14:49.