  • Enter vs. Origin - Landmark survival analysis

    I'm working on a survival analysis for patients undergoing cancer treatment. To account for immortal time bias, I want to perform a Landmark analysis.

    I have a variable that provides the survival time, from day of diagnosis to last follow-up/death (var: DX_LASTCONTACT_DEATH_MONTHS).
    I have a second variable that provides the time from diagnosis to first "treatment". (var: DX_RX_STARTED_DAYS)

    I want my landmark time to be the day of first treatment. As a sensitivity analysis, I also want to set the landmark time to 30 and 90 days.

    My code is:

    stset DX_LASTCONTACT_DEATH_DAYS, failure(PUF_VITAL_STATUS==0) origin(time DX_RX_STARTED_DAYS)

    For the 30 and 90 days:

    stset DX_LASTCONTACT_DEATH_DAYS, failure(PUF_VITAL_STATUS==0) origin(time 30)

    I can't really figure out if I should use "origin" or "enter" to perform the Landmark analysis. I've read the manual and can't really figure it out.

    What is the correct code: "enter" or "origin"?

  • #2
    Short answer: it's enter() you want, but because of the way your times are defined I think you need to consider both enter() and origin(). I'll give an example using one of my teaching data sets.

    Using terminology common in medical statistics, I think of the two options as follows:

    origin() defines the time origin; the time at which the timescale is zero.

    enter() specifies when individuals become at risk (become under observation).
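
    As a rough illustration of the distinction, here is a minimal sketch on two made-up records. The variable names futime, died and rxstart are hypothetical and chosen only for this example; times are days since diagnosis.

    ```stata
    * Hypothetical toy data: follow-up time, death indicator, day treatment started
    clear
    input id futime died rxstart
    1 100 1 20
    2  50 0 10
    end

    * origin(): the clock is reset so that time zero is the start of treatment
    stset futime, failure(died==1) origin(time rxstart)
    list id _t0 _t _d _st
    * id 1: _t0 = 0, _t = 80;  id 2: _t0 = 0, _t = 40

    * enter(): time zero stays at diagnosis; subjects become at risk at rxstart
    stset futime, failure(died==1) enter(time rxstart)
    list id _t0 _t _d _st
    * id 1: _t0 = 20, _t = 100;  id 2: _t0 = 10, _t = 50
    ```

    In both cases the time before treatment is excluded from the risk period; what differs is where zero sits on the timescale.
    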

    The example data are for patients diagnosed with skin melanoma; dx is the date of diagnosis and exit is the date of exit (death or censoring). A standard analysis would be as follows:

    Code:
    use https://pauldickman.com/data/melanoma.dta if stage>0, clear
    stset exit, fail(status==1) origin(dx) scale(365.24)

    The time origin (definition of time zero) is set to the date of diagnosis for every individual. Because enter() is not specified, it is assumed that everyone becomes at risk at time zero, which is usually what we want.

    If we wanted to use attained age as the timescale (not sensible for these data but I want to illustrate the Stata commands) we would use the following code. I don't have date of birth (dob) in the data so will approximate it from age at diagnosis and date of diagnosis.

    Code:
    generate dob=dx-age*365.24
    stset exit, fail(status==1) origin(dob) enter(dx) scale(365.24)

    Time zero is now the date of birth. Patients become at risk at their date of diagnosis, but if you look at _t0 and _t you will see that these are the ages at diagnosis and exit. We are using attained age as the timescale, and there is late entry on that timescale.
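
    One way to check this, sketched under the assumption that the melanoma data are loaded and the attained-age stset command above has just been run, is to compare _t0 with the recorded age at diagnosis. The variable name agediff is hypothetical.

    ```stata
    * Assumes the attained-age stset above has been run on the melanoma data
    list age _t0 _t _d in 1/5

    * _t0 should be (approximately) the age at diagnosis, so this
    * difference should be very close to zero for every patient
    generate double agediff = _t0 - age
    summarize agediff
    ```
    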

    If you want patients to enter 30 days after diagnosis then you'd do the following:

    Code:
    stset exit, fail(status==1) origin(dx) scale(365.24) enter(time dx + 30)

    Here we are saying that time zero is the date of diagnosis, but individuals do not become at risk until 30 days after diagnosis.

    Because you have pre-calculated survival time as the number of days from diagnosis to last follow-up/death, time zero is already the date of diagnosis, so origin() can be left at its default of zero and only enter() is needed.

    Code:
    * Landmark at the start of treatment
    stset DX_LASTCONTACT_DEATH_DAYS, failure(PUF_VITAL_STATUS==0) enter(time DX_RX_STARTED_DAYS)
    * Landmark at 30 days after diagnosis (use 90 for the 90-day analysis)
    stset DX_LASTCONTACT_DEATH_DAYS, failure(PUF_VITAL_STATUS==0) enter(time 30)

    I recommend looking at _t0, _t, _d, and _st to ensure Stata is doing what you hope.
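
    A minimal sketch of such a check, assuming the data have just been stset with one of the enter() commands for this data set (the variable names are those from the original question):

    ```stata
    * Compare the raw times with the variables created by stset
    list DX_RX_STARTED_DAYS DX_LASTCONTACT_DEATH_DAYS _t0 _t _d _st in 1/10

    * Summarise entry times, exit times and time at risk
    stdescribe

    * Records whose follow-up ends on or before the landmark are not
    * used in the analysis; stset marks them with _st == 0
    count if _st == 0
    ```

    With late entry, _t0 should equal the landmark time and _t the original follow-up time for every record with _st == 1.
    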
    Last edited by Paul Dickman; 24 Sep 2020, 08:59. Reason: Edit: Reread the OP more carefully.

  • #3
    Dr. Dickman,

    Thanks for your detailed response! I've checked the _t and _t0 - they match the times I want.
