survival analysis about cloglog model

Xiaopei Cheng

Join Date: Feb 2023

Posts: 9
#1

28 Feb 2023, 02:36

I've been studying the cloglog model recently, working from Jenkins's Lesson 6 on discrete-time models.
On data reorganisation, the lesson says: "The binary dependent variable also needs to be created. If subject i’s survival time is censored, the binary dependent variable is equal to 0 for all of i’s spell months; if subject i’s survival time is not censored, the binary dependent variable is equal to 0 for all but the last of i’s spell months (month 1,..., Ti–1) and equal to 1 for the last month (month Ti)."
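For single-spell data, the rule quoted above can be sketched roughly as follows. This is only an illustration on made-up data: the variable names (T for spell length in months, cens for the censoring flag, month, event) and the two example subjects are assumptions, not taken from the lesson.

```stata
* Hypothetical single-spell data: one row per subject
* T = spell length in months, cens = 1 if the spell is censored
clear
input id T cens
1 3 0
2 2 1
end

expand T                           // one row per subject-month
bysort id: gen month = _n          // month 1,...,T_i within subject

* 1 only in the last month of an uncensored spell, 0 otherwise
bysort id (month): gen byte event = (cens == 0 & _n == _N)

list id month cens event, sepby(id) noobs
```

Subject 1 (not censored) gets event = 1 only in month 3; subject 2 (censored) gets event = 0 in both months.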
As a beginner, I have a small question.

If my data look like this, what should I do?
studytim is the survey date, y is the failure event, and x1 and x2 are explanatory variables.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(studytim y x1 x2 id y1)
1989 0 1 61 1 0
1989 1 1 56 2 0
1989 1 1 63 3 0
1989 1 1 56 4 0
1991 0 1 65 1 0
1991 1 0 56 2 0
1991 1 1 63 3 0
1991 1 1 56 4 0
1993 0 0 59 1 0
1993 1 0 67 2 0
1993 1 1 58 3 0
1993 1 0 56 4 0
1997 0 0 59 1 0
1997 1 0 67 2 0
1997 0 1 58 3 0
1997 1 1 56 4 0
2000 0 1 52 1 0
2000 1 1 67 2 0
2000 1 0 58 3 0
2000 0 0 56 4 0
2004 0 0 52 1 0
2004 1 1 67 2 0
2004 1 0 58 3 0
2004 0 0 56 4 0
2006 0 1 52 1 0
2006 1 1 63 2 0
2006 1 1 58 3 0
2006 0 0 58 4 0
2011 0 0 56 1 0
2011 1 0 63 2 0
2011 1 1 56 3 0
2011 0 1 58 4 0
2015 0 1 56 1 0
2015 1 1 63 2 1
2015 0 0 56 3 0
2015 0 1 58 4 0
end
sort id studytim
* y1 is already in the -dataex- listing, so -replace- it rather than -generate- it
bysort id: replace y1 = (y == 1 & _n == _N) // the new variable y1 seems to be wrong
cloglog y1 x1 x2, vce(robust) nolog
* Another model
xtset id studytim
xtcloglog y1 x1 x2, pa nolog


Question: Look at individuals 3 and 4. The new variable y1 is wrong for them.
How should I fix it? I would appreciate any help. (Jenkins's lesson covers single-spell data, but what should I do when a subject enters the state again after exiting it?)
Tags: cloglog model, discrete time model, panel data, survival analysis

Stephen Jenkins

Join Date: Apr 2014
Posts: 1435

28 Feb 2023, 11:33

Here is what the data look like after your -sort id studytim- line:

Code:

. list, sepby(id)

     +----------------------------------+
     | studytim   y   x1   x2   id   y1 |
     |----------------------------------|
  1. |     1989   0    1   61    1    0 |
  2. |     1991   0    1   65    1    0 |
  3. |     1993   0    0   59    1    0 |
  4. |     1997   0    0   59    1    0 |
  5. |     2000   0    1   52    1    0 |
  6. |     2004   0    0   52    1    0 |
  7. |     2006   0    1   52    1    0 |
  8. |     2011   0    0   56    1    0 |
  9. |     2015   0    1   56    1    0 |
     |----------------------------------|
 10. |     1989   1    1   56    2    0 |
 11. |     1991   1    0   56    2    0 |
 12. |     1993   1    0   67    2    0 |
 13. |     1997   1    0   67    2    0 |
 14. |     2000   1    1   67    2    0 |
 15. |     2004   1    1   67    2    0 |
 16. |     2006   1    1   63    2    0 |
 17. |     2011   1    0   63    2    0 |
 18. |     2015   1    1   63    2    1 |
     |----------------------------------|
 19. |     1989   1    1   63    3    0 |
 20. |     1991   1    1   63    3    0 |
 21. |     1993   1    1   58    3    0 |
 22. |     1997   0    1   58    3    0 |
 23. |     2000   1    0   58    3    0 |
 24. |     2004   1    0   58    3    0 |
 25. |     2006   1    1   58    3    0 |
 26. |     2011   1    1   56    3    0 |
 27. |     2015   0    0   56    3    0 |
     |----------------------------------|
 28. |     1989   1    1   56    4    0 |
 29. |     1991   1    1   56    4    0 |
 30. |     1993   1    0   56    4    0 |
 31. |     1997   1    1   56    4    0 |
 32. |     2000   0    0   56    4    0 |
 33. |     2004   0    0   56    4    0 |
 34. |     2006   0    0   58    4    0 |
 35. |     2011   0    1   58    4    0 |
 36. |     2015   0    1   58    4    0 |
     +----------------------------------+

I'm guessing that "studytim" is actually calendar year (not "elapsed duration" as in the cancer dataset), and that "y" is a binary indicator summarising whether the unit is in a state of some kind, with "1" meaning in the state in the relevant year and "0" meaning not in the state in the relevant year. A sequence of 1s (or 0s) defines a spell. Person 3 has 2 spells (sequences of 1s) according to this conjecture. So, have a look at e.g.

Code:

SJ-15-1 dm0079  . . . . . . . . . . . . . . .  Stata tip 123: Spell boundaries
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/15   SJ 15(1):319--323                                (no commands)
        shows how to identify spells

and also -ssc describe spell- for a program by Nick Cox that is helpful in this regard too. (I recall there is another related utility by Nick, but I've forgotten its name.)

Only once you've defined the spells can you begin to define the elapsed duration and event variables, and think about how to model spell lengths.
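As a rough sketch of that sequence (spells first, then elapsed duration and event), here is one by-hand approach in the spirit of the spell-boundaries tip. It uses only the rows for persons 2 and 3 from the listing above; the names start, spell, event, and dur are illustrative, and the choice to count duration in survey waves rather than calendar years is an assumption.

```stata
* Sketch only: persons 2 and 3 copied from the listing above
clear
input id studytim y
2 1989 1
2 1991 1
2 1993 1
2 1997 1
2 2000 1
2 2004 1
2 2006 1
2 2011 1
2 2015 1
3 1989 1
3 1991 1
3 1993 1
3 1997 0
3 2000 1
3 2004 1
3 2006 1
3 2011 1
3 2015 0
end
sort id studytim

* A spell of 1s starts at the first wave, or when the previous wave had y != 1
by id: gen byte start = (y == 1) & (_n == 1 | y[_n-1] != 1)

* Running count of starts numbers the spells within person (missing when y == 0)
by id: gen spell = sum(start) if y == 1

* Event: the spell is seen to end, i.e. the person's next wave has y == 0.
* At a person's last wave y[_n+1] is missing, so event = 0 (right-censored).
by id: gen byte event = (y[_n+1] == 0) if y == 1

* Elapsed duration, counted in survey waves since the spell began
bysort id spell (studytim): gen dur = _n if y == 1

sort id studytim
list id studytim y spell dur event, sepby(id) noobs
```

Two caveats apply to this sketch. Spells already in progress in 1989 have an unknown start, so dur understates their true length. The waves are also unequally spaced, so a duration in waves is not a duration in years.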

Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#3

01 Mar 2023, 02:51

PS: The updated version I was trying to recall is -tsspell-, on SSC.
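A minimal sketch of how -tsspell- might be applied to data like these, using person 3's rows from the listing above. It assumes -tsspell- has been installed from SSC and that it creates _spell, _seq, and _end by default. The consecutive wave counter (wave, an illustrative name) is there because, as far as I understand it, -tsspell- treats gaps in the tsset time variable as spell breaks, and the survey years here are not consecutive.

```stata
* Sketch only: requires -tsspell- (ssc install tsspell)
clear
input id studytim y
3 1989 1
3 1991 1
3 1993 1
3 1997 0
3 2000 1
3 2004 1
3 2006 1
3 2011 1
3 2015 0
end

* Index the waves 1, 2, 3, ... so the time variable has no gaps
bysort id (studytim): gen wave = _n
tsset id wave

* Spells are runs of consecutive waves with y == 1
tsspell, cond(y == 1)

list id studytim y _spell _seq _end, noobs
```

Note that _end flags the last observation of each spell in the data. That includes a spell still in progress at the final wave, so it does not by itself separate an observed exit from right-censoring.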
Xiaopei Cheng

Join Date: Feb 2023

Posts: 9
#4

01 Mar 2023, 03:21

Dear Professor Jenkins, that's very kind of you. I really appreciate your answer to my question.
Your guess is right: "studytim" is calendar year (not "elapsed duration" as in the cancer dataset), and "y" is a binary indicator summarising whether the unit is in a state of some kind, with "1" meaning in the state in the relevant year and "0" meaning not in the state in the relevant year. That is what my raw data look like.

Following your suggestion, I tried to run it like this. I wonder whether I did it right.
Code:
clear
input float(studytim y x1 x2 id)
1989 0 1 61 1
1989 1 1 56 2
1989 1 1 63 3
1989 1 1 56 4
1991 0 1 65 1
1991 1 0 56 2
1991 1 1 63 3
1991 1 1 56 4
1993 0 0 59 1
1993 1 0 67 2
1993 1 1 58 3
1993 1 0 56 4
1997 0 0 59 1
1997 1 0 67 2
1997 0 1 58 3
1997 1 1 56 4
2000 0 1 52 1
2000 1 1 67 2
2000 1 0 58 3
2000 0 0 56 4
2004 0 0 52 1
2004 1 1 67 2
2004 1 0 58 3
2004 0 0 56 4
2006 0 1 52 1
2006 1 1 63 2
2006 1 1 58 3
2006 0 0 58 4
2011 0 0 56 1
2011 1 0 63 2
2011 1 1 56 3
2011 0 1 58 4
2015 0 1 56 1
2015 1 1 63 2
2015 0 0 56 3
2015 0 1 58 4
end
sort id studytim
tsset id studytim
tsspell, cond(y == 1)
cloglog _end x1 x2, r nolog

After tsspell, it seems that the variable _end is the binary dependent variable we want to create.
Could you take another look?