  • survival analysis about cloglog model

    I've been studying the cloglog model recently, working from Jenkins's Lesson 6 on discrete-time models.
    On data reorganisation, it says: "The binary dependent variable also needs to be created. If subject i’s survival time is censored, the binary dependent variable is equal to 0 for all of i’s spell months; if subject i’s survival time is not censored, the binary dependent variable is equal to 0 for all but the last of i’s spell months (month 1,..., Ti–1) and equal to 1 for the last month (month Ti)."
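    As a minimal sketch of the quoted rule for single-spell data (the toy data and the variable names T, died, month and event are made up for illustration and are not from the lesson):

```stata
clear
* Hypothetical single-spell data: one row per subject
* T = survival time in months, died = 1 if the spell is not censored
input id T died
1 3 1
2 2 0
end

expand T                                       // one row per subject-month at risk
bysort id: gen month = _n                      // spell month 1, ..., T
by id: gen byte event = died == 1 & _n == _N   // 1 only in the last month of an uncensored spell

list, sepby(id)
```

    Subject 1 (not censored) gets event = 1 only in month 3; subject 2 (censored) gets event = 0 in both months.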
    As a beginner, I have a small question.

    If the data look like this, what should I do?
    studytim is the survey date, y is the failure event, and x1 and x2 are explanatory variables.

    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(studytim y x1 x2 id y1)
    1989 0 1 61 1 0
    1989 1 1 56 2 0
    1989 1 1 63 3 0
    1989 1 1 56 4 0
    1991 0 1 65 1 0
    1991 1 0 56 2 0
    1991 1 1 63 3 0
    1991 1 1 56 4 0
    1993 0 0 59 1 0
    1993 1 0 67 2 0
    1993 1 1 58 3 0
    1993 1 0 56 4 0
    1997 0 0 59 1 0
    1997 1 0 67 2 0
    1997 0 1 58 3 0
    1997 1 1 56 4 0
    2000 0 1 52 1 0
    2000 1 1 67 2 0
    2000 1 0 58 3 0
    2000 0 0 56 4 0
    2004 0 0 52 1 0
    2004 1 1 67 2 0
    2004 1 0 58 3 0
    2004 0 0 56 4 0
    2006 0 1 52 1 0
    2006 1 1 63 2 0
    2006 1 1 58 3 0
    2006 0 0 58 4 0
    2011 0 0 56 1 0
    2011 1 0 63 2 0
    2011 1 1 56 3 0
    2011 0 1 58 4 0
    2015 0 1 56 1 0
    2015 1 1 63 2 1
    2015 0 0 56 3 0
    2015 0 1 58 4 0
    end
    sort id studytim
    bysort id: ge y1 = y == 1 & _n == _N // there seem to be some mistakes in the new variable "y1"
    cloglog y1 x1 x2, r nolog
    ***another model
    xtset id studytim // panel variable first, then the time variable
    xtcloglog y1 x1 x2, pa nolog // variables from this dataset (was: dead drug age)


    Question: look at individuals 3 and 4; the new variable y1 is wrong for them.
    How should I solve this? I would appreciate any help. (Jenkins's lesson covers single-spell data, but what should I do when a subject re-enters the state after exiting?)

  • #2
    Here is what the data look like after your -sort id studytim- line:

    Code:
    . list, sepby(id)
    
         +----------------------------------+
         | studytim   y   x1   x2   id   y1 |
         |----------------------------------|
      1. |     1989   0    1   61    1    0 |
      2. |     1991   0    1   65    1    0 |
      3. |     1993   0    0   59    1    0 |
      4. |     1997   0    0   59    1    0 |
      5. |     2000   0    1   52    1    0 |
      6. |     2004   0    0   52    1    0 |
      7. |     2006   0    1   52    1    0 |
      8. |     2011   0    0   56    1    0 |
      9. |     2015   0    1   56    1    0 |
         |----------------------------------|
     10. |     1989   1    1   56    2    0 |
     11. |     1991   1    0   56    2    0 |
     12. |     1993   1    0   67    2    0 |
     13. |     1997   1    0   67    2    0 |
     14. |     2000   1    1   67    2    0 |
     15. |     2004   1    1   67    2    0 |
     16. |     2006   1    1   63    2    0 |
     17. |     2011   1    0   63    2    0 |
     18. |     2015   1    1   63    2    1 |
         |----------------------------------|
     19. |     1989   1    1   63    3    0 |
     20. |     1991   1    1   63    3    0 |
     21. |     1993   1    1   58    3    0 |
     22. |     1997   0    1   58    3    0 |
     23. |     2000   1    0   58    3    0 |
     24. |     2004   1    0   58    3    0 |
     25. |     2006   1    1   58    3    0 |
     26. |     2011   1    1   56    3    0 |
     27. |     2015   0    0   56    3    0 |
         |----------------------------------|
     28. |     1989   1    1   56    4    0 |
     29. |     1991   1    1   56    4    0 |
     30. |     1993   1    0   56    4    0 |
     31. |     1997   1    1   56    4    0 |
     32. |     2000   0    0   56    4    0 |
     33. |     2004   0    0   56    4    0 |
     34. |     2006   0    0   58    4    0 |
     35. |     2011   0    1   58    4    0 |
     36. |     2015   0    1   58    4    0 |
         +----------------------------------+
    I'm guessing that "studytim" is actually calendar year (not elapsed duration, as in the cancer dataset), and that "y" is a binary indicator of whether the unit is in some state in the relevant year: 1 means in the state, 0 means not in the state. A sequence of 1s (or 0s) defines a spell. On this conjecture, person 3 has 2 spells (sequences of 1s). So, have a look at e.g.

    Code:
    SJ-15-1 dm0079  . . . . . . . . . . . . . . .  Stata tip 123: Spell boundaries
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q1/15   SJ 15(1):319--323                                (no commands)
            shows how to identify spells
    and also -ssc describe spell- for a program by Nick Cox that is helpful in this regard too. (I recall there is another related utility by Nick, but I've forgotten its name.)

    Only once you've defined the spells can you begin to define the elapsed duration and event variables, and then think about how to model spell lengths.
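    A rough sketch of those steps, done by hand with by: and subscripts, might look as follows. It uses only persons 3 and 4 from the listing above. The names begin, spell, seq and exit are made up here, and it rests on two assumptions: duration is counted in survey waves (not calendar years), and an "event" is an observed move from y == 1 to y == 0.

```stata
clear
input studytim y id
1989 1 3
1991 1 3
1993 1 3
1997 0 3
2000 1 3
2004 1 3
2006 1 3
2011 1 3
2015 0 3
1989 1 4
1991 1 4
1993 1 4
1997 1 4
2000 0 4
2004 0 4
2006 0 4
2011 0 4
2015 0 4
end

* A spell of y == 1 starts at the first wave or right after a wave with y == 0
bysort id (studytim): gen byte begin = y == 1 & (_n == 1 | y[_n-1] == 0)

* Number the spells within person; waves outside the state get spell 0
by id: gen spell = sum(begin)
replace spell = 0 if y == 0

* Event: the next observed wave has y == 0 (0 if the spell runs to the last wave)
by id: gen byte exit = (y[_n+1] == 0) if spell > 0

* Elapsed duration, counted in waves, within each spell
bysort id spell (studytim): gen seq = _n if spell > 0

sort id studytim
list, sepby(id)
```

    Spells already in progress in 1989 have an unknown start date (left-censoring), so seq understates their true duration; whether that matters depends on the application.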




    • #3
      PS The updated version I was trying to recall is -tsspell-, on SSC



      • #4
        Dear Professor Jenkins, that is very kind of you. I really appreciate your answer to my question.
        Your guess is right: "studytim" is calendar year (not elapsed duration, as in the cancer dataset), and "y" is a binary indicator of whether the unit is in the state in the relevant year (1 = in the state, 0 = not in the state). My raw data look like this.

        According to your suggestion, I tried to run it like this. I wonder if I did it right.
        Code:
        clear
        input float(studytim y x1 x2 id)
        1989 0 1 61 1
        1989 1 1 56 2
        1989 1 1 63 3
        1989 1 1 56 4
        1991 0 1 65 1
        1991 1 0 56 2
        1991 1 1 63 3
        1991 1 1 56 4
        1993 0 0 59 1
        1993 1 0 67 2
        1993 1 1 58 3
        1993 1 0 56 4
        1997 0 0 59 1
        1997 1 0 67 2
        1997 0 1 58 3
        1997 1 1 56 4
        2000 0 1 52 1
        2000 1 1 67 2
        2000 1 0 58 3
        2000 0 0 56 4
        2004 0 0 52 1
        2004 1 1 67 2
        2004 1 0 58 3
        2004 0 0 56 4
        2006 0 1 52 1
        2006 1 1 63 2
        2006 1 1 58 3
        2006 0 0 58 4
        2011 0 0 56 1
        2011 1 0 63 2
        2011 1 1 56 3
        2011 0 1 58 4
        2015 0 1 56 1
        2015 1 1 63 2
        2015 0 0 56 3
        2015 0 1 58 4
        end
        sort id studytim
        tsset id studytim
        tsspell, cond(y == 1)
        cloglog _end x1 x2, r nolog

        After tsspell, it seems that the variable _end is the binary dependent variable we want to create.
        Could you take another look?
