Immortal time bias: how to match observations on multiple variables

LarsFolkestad

Join Date: Sep 2014

Posts: 165
#1

Immortal time bias: how to match observations on multiple variables

09 Jun 2016, 01:55

Dear Listers,
This is perhaps more a theoretical question than a directly Stata-related one, but there is a question about code in there as well.

The problem: In a survival study I have two cohorts (exposed and unexposed). Observation starts at time X and ends at time Z, where Z is defined as the end of observation or the time of death.
At some point between X and Z (at time X + Ndays) the exposed become exposed. I want to calculate the HR for death comparing the exposed with the unexposed. Exposure here is a specific treatment.

I have, however, introduced an immortal time of Ndays into the exposed group, because I condition them on a future event: the exposed have to be alive at the time of exposure, and the unexposed don't. This gives an HR above 1.00 when comparing the unexposed to the exposed.

Well, I know that the exposed are unexposed until they in fact become exposed. So I thought I might split my data on exposure, treating the exposed as unexposed until they get exposed. That, on the other hand, only moves the immortal Ndays over to the unexposed group, and I get an HR (when comparing the unexposed to the exposed) closer to 1.00 or even below 1.00.
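A minimal sketch of the split described above, using four rows copied from the mock dataset further down in this post. The variable names split, seg, t0, t1, treated_now and died are made up for the illustration, and the sketch assumes that exposure starts strictly inside follow-up (rows with an exposure date before the start of observation would need separate handling). With only four subjects the Cox estimate is meaningless; the point is only how the rows are laid out.

```stata
* Sketch: treatment handled as a time-varying exposure.
clear
input float(id case dead start_of_obs expo_date end_of_obs)
1 0 0 18518     . 19358
2 1 1 17358 17601 17843
3 1 1 16025 16652 16670
7 0 1 10757     . 11119
end
format %td start_of_obs expo_date end_of_obs

* Subjects whose exposure starts strictly inside follow-up get two rows.
generate byte split = !missing(expo_date) & expo_date > start_of_obs & expo_date < end_of_obs
expand 2 if split
bysort id: generate byte seg = _n

* Row 1 runs from start to exposure, row 2 from exposure to end of observation.
generate t0 = start_of_obs
generate t1 = end_of_obs
replace t1 = expo_date if split & seg == 1
replace t0 = expo_date if split & seg == 2

* Exposure status and death are attached to the row in which they occur.
generate byte treated_now = split & seg == 2
generate byte died = dead
replace died = 0 if split & seg == 1

* Analysis time is days since start of observation; each row covers (t0, t1].
stset t1, failure(died) time0(t0) origin(time start_of_obs) id(id)
stcox treated_now
```

Laid out this way, the days before exposure are counted as unexposed person-time, and a death can only be attributed to the exposure status the subject had on the day of death.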
I have also tried a conditional landmark approach: observation time starts at a set point in time, exposure status is defined at that point, and the two groups are compared. This did not change much, and I leave out about half of my cohort, namely those who become exposed later than 1 year after the start of observation, so I lose power.
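For reference, a landmark analysis of this kind could look roughly like the sketch below, again on rows copied from the mock dataset in this post. The 365-day landmark and the names landmark, futime and exposed_lm are assumptions for the illustration, and the sketch takes one common reading of the approach in which subjects exposed after the landmark stay in the analysis as unexposed.

```stata
* Sketch: landmark analysis with exposure status frozen at day 365.
clear
input float(id case dead start_of_obs expo_date end_of_obs)
1 0 0 18518     . 19358
2 1 1 17358 17601 17843
3 1 1 16025 16652 16670
7 0 1 10757     . 11119
8 1 0 18745 18995 19358
9 0 1 11153     . 11650
end
format %td start_of_obs expo_date end_of_obs

local landmark = 365   // assumed landmark, in days after start of observation

* Only subjects still under observation at the landmark are analysed.
generate futime = end_of_obs - start_of_obs
keep if futime > `landmark'

* Exposure is whatever had happened by the landmark; later exposure is ignored.
generate byte exposed_lm = !missing(expo_date) & expo_date - start_of_obs <= `landmark'

* The clock starts at the landmark, not at the start of observation.
stset end_of_obs, failure(dead) origin(time start_of_obs + `landmark') id(id)
stcox exposed_lm
```

In this sketch id 7 is dropped (follow-up ends before day 365) and id 3, who is exposed on day 627, is analysed as unexposed.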

So I thought: what if I just disregard the Ndays, but in both cohorts? I would match my cohorts on whatever covariates I would normally put in my Cox model (gender, age at start, Charlson comorbidity index, for example), drop all of the unexposed who are not alive at the time their matched exposed counterpart gets exposed, and then run the Cox model.

My questions are:
1) Is this a feasible way to go about it?
2) What about those exposed/unexposed who cannot be matched with anyone in the other cohort?

3) Could I use the Ndays until exposure in the model?
I thought I might put it in as a continuous variable and look at the increase in hazard for each additional day of Ndays, and so be able to say something about the effect of prolonging the time to exposure.
But should the Ndays then be:
for the exposed = days from start of obs until the exposure date
for the unexposed = days from start of obs until end of obs, OR days from start of obs until the matched exposed counterpart gets exposed?

4) How would I go about matching the two cohorts, and how do I figure out whether the unexposed have died before the matched exposed counterpart gets exposed?

I provide you with a mock dataset (the original dataset has 4500 exposed and 4200 unexposed)

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id case gender cci dead start_of_obs expo_date end_of_obs age_start_obs)
 1 0 0 0 0 18518     . 19358 76
 2 1 1 2 1 17358 17601 17843 43
 3 1 0 0 1 16025 16652 16670 70
 4 1 1 1 0 19323 19316 19358 58
 5 0 0 3 0 18609     . 19358 41
 6 1 0 0 0 19268 19141 19358 31
 7 0 1 0 1 10757     . 11119 73
 8 1 1 1 0 18745 18995 19358 40
 9 0 0 1 1 11153     . 11650 34
10 0 0 3 0 19342     . 19358 67
11 1 1 2 0 18815 19215 19358 65
12 0 1 3 0 19312     . 19358 80
13 1 0 1 0 18881 19010 19358 64
14 0 0 2 1 18407     . 19088 43
15 0 0 1 1 11627     . 12352 68
16 0 0 1 0 18619     . 19358 32
17 0 1 2 1 10231     . 10920 40
18 1 0 3 0 18975 18979 19358 29
19 1 0 0 1 14682 15114 15478 63
20 1 0 0 0 18899 19262 19358 37
21 1 0 0 0 18481 19150 19358 79
22 0 0 3 1 12897     . 13429 57
23 0 1 3 1 11887     . 12346 27
24 1 0 2 1 12792 13318 13568 47
25 1 1 0 1  7435  7880  7954 20
end
format %td start_of_obs
format %td expo_date
format %td end_of_obs

id is unique per subject
case indicates whether the subject is exposed (1) or unexposed (0)
cci is the Charlson comorbidity index

Hope my questions are somewhat understandable.

Lars
Tags: None
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

09 Jun 2016, 08:49

I don't know if I understand what you are asking or not. But to the extent I do, it sounds to me like it is handled by the appropriate use of the -origin()- and -enter()- options in -stset-. -enter()- designates the time that the subject comes under observation, and -origin()- designates the time that the subject becomes at risk for the failure outcome. It sounds to me like your Ndays represents the interval between those two. Am I on the right track here?
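One possible reading of this suggestion, sketched on four rows of the mock data in #1: analysis time is counted from the start of observation via -origin()-, while the treated only come under observation at their exposure date via -enter()-. The helper variable entry_date is hypothetical, and the choice to let the untreated enter at the start of observation is an assumption, not something stated above.

```stata
* Sketch: delayed entry for the treated via -origin()- and -enter()-.
clear
input float(id case dead start_of_obs expo_date end_of_obs)
1 0 0 18518     . 19358
2 1 1 17358 17601 17843
3 1 1 16025 16652 16670
7 0 1 10757     . 11119
end
format %td start_of_obs expo_date end_of_obs

* Hypothetical helper: the date on which each subject starts contributing risk time.
* Treated subjects enter at exposure (never before start of observation),
* untreated subjects enter at start of observation.
generate entry_date = cond(case == 1, max(expo_date, start_of_obs), start_of_obs)
format %td entry_date

stset end_of_obs, failure(dead) origin(time start_of_obs) enter(time entry_date) id(id)

* _t0 is the analysis time at entry, _t the analysis time at exit.
list id case _t0 _t _d, noobs
```

Under this setup the Ndays between start of observation and exposure are left out of the risk time of the treated altogether, rather than being credited to either group.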
Comment
LarsFolkestad

Join Date: Sep 2014

Posts: 165
#3

09 Jun 2016, 13:29

Clyde Schechter Thank you.
I think I took a difficult problem and made it incomprehensible.

I am not sure that -enter()- and -origin()- solve my problem. I will try to make it a bit clearer.

I have a cohort of patients diagnosed between 1977 and 2012, and observation ends at the end of 2012.
Some patients will be treated for the disease and some will go through life untreated. Let's just assume that the treated are comparable to the untreated and that the decision to treat is random within the cohort of ill patients (it's probably not, but let's take that discussion another time).

Now I want to look at the effect of treatment on mortality risk.

The problem is that the diagnosis is made at time X.
Treatment is started at time X+N.
This N can be anything from 0 days to 33 years.

The exposed are defined as those who started treatment at any time, looking back from the end of 2012. So all the treated patients are immortal for N days, as they have to be alive long enough to be treated.

Now let's look at a very small sample:

id1 is diagnosed at time = 0
id2 is diagnosed at time = 4

id1 starts treatment at time = 300 days
id2 never starts treatment

id1 dies at time = 450 days
id2 dies at time = 200 days

If we compare the two directly, treatment looks good, but if id2 had lived longer, treatment might have been started at some point; we just don't know.

The following pair is more what I think I want.
id3 is diagnosed at time = 0
id4 is diagnosed at time = 3

id3 starts treatment at time = 300 days
id4 is never treated

id3 dies at time = 400
id4 dies at time = 350

///
So, to overcome immortal time: could (and should) I match my two cohorts (treated vs. never treated, as far as we know) on age, gender and cci, and exclude pairs where the untreated subject dies before the matched treated counterpart starts treatment?

If yes, how do I do that in the example in #1?
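A rough sketch of how such a pairing might be set up with -joinby-, on eight rows copied from the mock data in #1. Everything beyond those rows is an assumption made for the illustration: exact matching on gender and cci, a 10-year age caliper, picking the control closest in age, and the names ctrl_id, ctrl_age, ctrl_dead, ctrl_futime, ndays and agediff. The sketch also lets the same control serve several treated subjects, which a real analysis would have to deal with, and it says nothing about whether matching is the right remedy.

```stata
* Sketch: pair each treated subject with an untreated one who is still under
* observation when the treated subject starts treatment.
clear
input float(id case gender cci dead start_of_obs expo_date end_of_obs age_start_obs)
 1 0 0 0 0 18518     . 19358 76
 2 1 1 2 1 17358 17601 17843 43
 3 1 0 0 1 16025 16652 16670 70
 7 0 1 0 1 10757     . 11119 73
13 1 0 1 0 18881 19010 19358 64
15 0 0 1 1 11627     . 12352 68
17 0 1 2 1 10231     . 10920 40
25 1 1 0 1  7435  7880  7954 20
end

* Save the untreated as a pool of candidate controls.
preserve
keep if case == 0
generate ctrl_futime = end_of_obs - start_of_obs
rename (id age_start_obs dead) (ctrl_id ctrl_age ctrl_dead)
keep ctrl_id gender cci ctrl_age ctrl_dead ctrl_futime
tempfile controls
save `controls'
restore

* Keep the treated and form all treated-control pairs with equal gender and cci.
keep if case == 1
generate ndays = expo_date - start_of_obs
joinby gender cci using `controls'

* The control must still be under observation ndays after its own start.
keep if ctrl_futime > ndays

* Assumed age caliper of 10 years, then keep the control closest in age.
generate agediff = abs(age_start_obs - ctrl_age)
keep if agediff <= 10
bysort id (agediff ctrl_id): keep if _n == 1

list id ctrl_id ndays ctrl_futime agediff, noobs
```

In this small subset id 25 ends up without a partner, because the only candidate with the same gender and cci (id 7) leaves observation after 362 days while id 25 starts treatment on day 445.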

Lars
Comment
Mads Lillevang-Johansen

Join Date: Sep 2015

Posts: 22
#4

09 Jun 2016, 23:07

I am following this
Comment
