  • Cox Regression - observations end on or before enter()

    Dear StataList Members,

    I am trying to conduct a survival analysis using the data posted below.

    The code I use is here: stset age_months , failure(monthly_duration==0)

    age_months is the age of the child in months. It is my time variable and runs from 0 to 59.
    monthly_duration is a dichotomous variable that indicates whether the child is still breastfed. If 1, the child is still breastfed; if 0, the child is no longer breastfed. It is my event variable.

    When I run this code:

    stset age_months , failure(monthly_duration==0), I get the notification :
    207691 total observations
    7625 observations end on or before enter()

    I know why this happens. In my data, 7625 observations are children who are 0 months old (age_months=0), and all of those 7625 children are still breastfed, without exception.
    That is, all those 7625 children take the value of 1 for their monthly_duration variable.

    My question is this: can I proceed with my analysis as it stands? Is that correct? The 7625 children are now lost: my N is 200066 rather than 207691 (the difference is 7625). If it is not correct, what should I do?

    Thank you very much.



    clear
    input float(age_months monthly_duration) double M4_1 str15 CASEID float age_months2
    0 1 20 " 10110 2" 21
    1 1 20 " 10110 2" 21
    2 1 20 " 10110 2" 21
    3 1 20 " 10110 2" 21
    4 1 20 " 10110 2" 21
    5 1 20 " 10110 2" 21
    6 1 20 " 10110 2" 21
    7 1 20 " 10110 2" 21
    8 1 20 " 10110 2" 21
    9 1 20 " 10110 2" 21
    10 1 20 " 10110 2" 21
    11 1 20 " 10110 2" 21
    12 1 20 " 10110 2" 21
    13 1 20 " 10110 2" 21
    14 1 20 " 10110 2" 21
    15 1 20 " 10110 2" 21
    16 1 20 " 10110 2" 21
    17 1 20 " 10110 2" 21
    18 1 20 " 10110 2" 21
    19 1 20 " 10110 2" 21
    20 1 20 " 10110 2" 21
    21 0 20 " 10110 2" 21
    0 1 17 " 102 3 2" 59
    1 1 17 " 102 3 2" 59
    2 1 17 " 102 3 2" 59
    3 1 17 " 102 3 2" 59
    4 1 17 " 102 3 2" 59
    5 1 17 " 102 3 2" 59
    6 1 17 " 102 3 2" 59
    7 1 17 " 102 3 2" 59
    8 1 17 " 102 3 2" 59
    9 1 17 " 102 3 2" 59
    10 1 17 " 102 3 2" 59
    11 1 17 " 102 3 2" 59
    12 1 17 " 102 3 2" 59
    13 1 17 " 102 3 2" 59
    14 1 17 " 102 3 2" 59
    15 1 17 " 102 3 2" 59
    16 1 17 " 102 3 2" 59
    17 1 17 " 102 3 2" 59
    18 0 17 " 102 3 2" 59
    19 0 17 " 102 3 2" 59
    20 0 17 " 102 3 2" 59
    21 0 17 " 102 3 2" 59
    22 0 17 " 102 3 2" 59
    23 0 17 " 102 3 2" 59
    24 0 17 " 102 3 2" 59
    25 0 17 " 102 3 2" 59
    26 0 17 " 102 3 2" 59
    27 0 17 " 102 3 2" 59
    28 0 17 " 102 3 2" 59
    29 0 17 " 102 3 2" 59
    30 0 17 " 102 3 2" 59
    31 0 17 " 102 3 2" 59
    32 0 17 " 102 3 2" 59
    33 0 17 " 102 3 2" 59
    34 0 17 " 102 3 2" 59
    35 0 17 " 102 3 2" 59
    36 0 17 " 102 3 2" 59
    37 0 17 " 102 3 2" 59
    38 0 17 " 102 3 2" 59
    39 0 17 " 102 3 2" 59
    40 0 17 " 102 3 2" 59
    41 0 17 " 102 3 2" 59
    42 0 17 " 102 3 2" 59
    43 0 17 " 102 3 2" 59
    44 0 17 " 102 3 2" 59
    45 0 17 " 102 3 2" 59
    46 0 17 " 102 3 2" 59
    47 0 17 " 102 3 2" 59
    48 0 17 " 102 3 2" 59
    49 0 17 " 102 3 2" 59
    50 0 17 " 102 3 2" 59
    51 0 17 " 102 3 2" 59
    52 0 17 " 102 3 2" 59
    53 0 17 " 102 3 2" 59
    54 0 17 " 102 3 2" 59
    55 0 17 " 102 3 2" 59
    56 0 17 " 102 3 2" 59
    57 0 17 " 102 3 2" 59
    58 0 17 " 102 3 2" 59
    59 0 17 " 102 3 2" 59
    0 1 24 " 10217 1" 41
    1 1 24 " 10217 1" 41
    2 1 24 " 10217 1" 41
    3 1 24 " 10217 1" 41
    4 1 24 " 10217 1" 41
    5 1 24 " 10217 1" 41
    6 1 24 " 10217 1" 41
    7 1 24 " 10217 1" 41
    8 1 24 " 10217 1" 41
    9 1 24 " 10217 1" 41
    10 1 24 " 10217 1" 41
    11 1 24 " 10217 1" 41
    12 1 24 " 10217 1" 41
    13 1 24 " 10217 1" 41
    14 1 24 " 10217 1" 41
    15 1 24 " 10217 1" 41
    16 1 24 " 10217 1" 41
    17 1 24 " 10217 1" 41
    end
    label values M4_1 M4_1

  • #2
    Yes, to analyse these data you just add a small quantity to the observations with age 0 months (see blog post linked below). I think there is a reasonable argument to add 0.5 to all values of age (rather than just those with age zero).
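A minimal sketch of that adjustment, assuming the original poster's variables (age_months, monthly_duration) are in memory. The name age_half is hypothetical, and the stset call is otherwise the poster's own; whether each row should count as a separate subject is a separate question.

```stata
* Assumption: age_months and monthly_duration exist as described in post #1.
* age_half is a hypothetical name for the shifted time variable.
generate double age_half = age_months + 0.5

* Same declaration as in the original post, but on the shifted time,
* so records with age_months == 0 now end at 0.5 instead of 0.
stset age_half, failure(monthly_duration==0)
```

After this, the "observations end on or before enter()" count reported by stset should no longer include the age-0 records.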

    Stata assumes that time is measured with infinite precision, so a person who experiences the event at time 0 has never been at risk and is excluded. In practice, we measure time in completed units rather than with infinite precision, so a recorded value of zero actually means a time greater than zero but less than 1.
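The exclusion can be seen on a toy dataset. This is only an illustrative sketch: the three records and the variable names t and fail are made up, not taken from the poster's data. Note that the record ending at time 0 is dropped whether or not it is a failure, which matches the poster's case of censored children at age 0.

```stata
* Made-up data: one record ends at time 0, the others later.
clear
input t fail
0 0
1 1
2 0
end

* stset reports one observation ending on or before enter()
stset t, failure(fail)

* _st marks the records kept for analysis; it is 0 for the t = 0 record
list t fail _st
```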

    Many common survival analysis methods, such as Kaplan-Meier and Cox regression, are non- (or semi-) parametric, so only the ordering of the times is important. Adding a constant to all times will not affect the results. If you are using parametric methods, one can argue that adding 0.5 to all times results in a better estimate of person-time at risk. That is, one assumes that individuals with a recorded time of t months have been at risk for t+0.5 months.
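One way to check the invariance claim for Cox regression is to fit the same model before and after the shift. The sketch below uses the cancer dataset that ships with Stata rather than the poster's data; the stored-estimate names raw and shifted and the variable t_shift are hypothetical.

```stata
* Example dataset shipped with Stata (studytime, died, drug, age)
sysuse cancer, clear

* Cox model on the recorded times
stset studytime, failure(died)
stcox age i.drug
estimates store raw

* Same model after adding 0.5 to every time
generate double t_shift = studytime + 0.5
stset t_shift, failure(died)
stcox age i.drug
estimates store shifted

* Coefficients and standard errors side by side
estimates table raw shifted, b(%9.6f) se(%9.6f)
```

Because the shift preserves the ordering of the times (and the ties), the two columns should agree. A parametric model such as streg would not be expected to show the same invariance.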

    Here's a blog post by Bill Gould with details.
    https://www.stata.com/support/faqs/s...and-cox-model/



    • #3
      To add to Paul's comment, here's a Statalist thread that helped me a lot in understanding how to treat observations that fail at t=0 (I am experiencing the same issue in my survival analysis). The thread concludes pretty much with the t + 0.5 procedure that Paul mentions, but there is some discussion that I think is worth checking.

      https://www.statalist.org/forums/for...e-at-time-zero

      Cheers, J.



      • #4
        Dear Professor,

        I am really thankful for your reply. It is really helpful for me.

        I now have another question. My data look like an unbalanced panel because I observe each child at every age (in months) up to his/her current age. For example, suppose the child with ID number 1 is 2 months old; I then observe this child from month 0 through month 2, as shown below. Do I need to do anything special to run a Cox model on unbalanced data like this? Thank you very much in advance.

        ID Age Monthly_Duration
        1 0 1
        1 1 1
        1 2 0
        2 0 1
        2 1 1

        Originally posted by Paul Dickman (#2)


        • #5
          Dear Professor,

          Thank you so much for your reply. I have checked the discussion and it helped me a lot. Now, I have another problem.
          My data look like an unbalanced panel because I observe each child at every age (in months) up to his/her current age. For example, suppose the child with ID number 1 is 2 months old; I then observe this child from month 0 through month 2, as shown below. Do I need to do anything special to run a Cox model on unbalanced data like this? Thank you very much in advance.

          ID Age Monthly_Duration
          1 0 1
          1 1 1
          1 2 0
          2 0 1
          2 1 1
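For data laid out like the table above, one possible approach is to declare the rows as multiple records per child using stset's id() option. The sketch below is an assumption about how that could look, not an answer given in this thread; it re-enters the five illustrative rows, uses hypothetical lower-case variable names, and applies the t + 0.5 shift so the age-0 rows cover a positive interval.

```stata
* Toy data in the layout shown above; values are illustrative only.
clear
input id age monthly_duration
1 0 1
1 1 1
1 2 0
2 0 1
2 1 1
end

* Shift time so the age-0 record spans (0, 0.5] rather than (0, 0]
generate double t = age + 0.5

* id() tells stset that rows sharing an id belong to the same child;
* each row then covers the interval from that child's previous t to this t.
stset t, id(id) failure(monthly_duration==0)

* Inspect the intervals and failure indicator stset constructed
list id age monthly_duration _t0 _t _d _st, sepby(id)
```

With this declaration the number of records may differ from child to child, and by default a child leaves the risk set at the first failure, so any rows after that point are ignored. An stcox call with the covariates of interest (not shown in the thread) would follow.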

          Originally posted by Jesus Pulido (#3)


          • #6
            Hi Cansu,

            I happened to know the answer to your question above because I am familiar with the adoption at t=0 issue in survival models (I am facing the same problem in my thesis writing). Unfortunately, I am not familiar with the unbalanced Cox model. Perhaps someone else here can provide more insights? - J
