  • Problems with stset on long data

    Hi,

    I'm trying to stset survival data in long format. I used reshape to convert from wide to long:

    Code:
    . reshape long PartDat PartAg HADSAnxi HADSDepr, i(id) j(_j)
    (note: j = 2 3)
    
    Data                               wide   ->   long
    -----------------------------------------------------------------------------
    Number of obs.                    78962   ->  157924
    Number of variables                 712   ->     709
    j variable (2 values)                     ->   _j
    xij variables:
                          PartDat2 PartDat3   ->   PartDat
                            PartAg2 PartAg3   ->   PartAg
                        HADSAnxi2 HADSAnxi3   ->   HADSAnxi
                        HADSDepr2 HADSDepr3   ->   HADSDepr
    -----------------------------------------------------------------------------
    But when I try to stset using id, I get an error due to "multiple records at same instant":

    Code:
    . stset enddate, id(id) failure(RegisStat==5) origin(time PartDat) 
    
                    id:  id
         failure event:  RegisStat == 5
    obs. time interval:  (enddate[_n-1], enddate]
     exit on or before:  failure
        t for analysis:  (time-origin)
                origin:  time PartDat
    
    ------------------------------------------------------------------------------
        157,924  total observations
        157,924  multiple records at same instant                   PROBABLE ERROR
                 (enddate[_n-1]==enddate)
    ------------------------------------------------------------------------------
              0  observations remaining, representing
              0  subjects
              0  failures in single-failure-per-subject data
              0  total analysis time at risk and under observation
                                                    at risk from t =         0
                                         earliest observed entry t =         .
                                              last observed exit t =         .
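One possible reading of this output, offered as a sketch rather than a diagnosis of the actual file: with id(), stset treats each row as an interval ending at enddate, and here enddate is identical on both rows of a person, so every row ends at the same instant as its sibling. The toy data below (made-up values; only the names id and enddate come from the post, while wave and n_same are illustrative) shows one way to count such rows.

```stata
clear
* Toy long data: two rows per person, enddate repeated on both rows (made-up values)
input id wave enddate
1 2 20407
1 3 20407
2 2 18032
2 3 18032
end
format enddate %td

* Count how many rows share the same id and exit time
bysort id enddate: gen byte n_same = _N
count if n_same > 1
display "rows sharing an exit time within id: " r(N)
```

If every row is flagged, as here, the records need distinct stop times before stset with id() can use them.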
    When I look at the failure variable RegisStat (Død = failure, Bosatt = still alive), it looks as though there might be an error in my reshape; for instance, individual 1 is marked as dead (Død) in both observation intervals:


    Code:
    . list id enddate RegisStat PartDat _t0 _t in 1/20
    
         +----------------------------------------------------+
         | id     enddate   Regi~tat     PartDat   _t0     _t |
         |----------------------------------------------------|
      1. |  1   15nov2015        Død   22 Oct 96     0   6963 |
      2. |  1   15nov2015        Død   05 Oct 06     0   3328 |
      3. |  2   15may2009        Død   27 Nov 95     0   4918 |
      4. |  2   15may2009        Død   07 May 07     0    739 |
      5. |  3   15aug2010        Død   15 Apr 97     0   4870 |
         |----------------------------------------------------|
      6. |  3   15aug2010        Død   23 May 08     0    814 |
      7. |  4   15jan2019     Bosatt   14 Sep 95     0   8524 |
      8. |  4   15jan2019     Bosatt           .     .      . |
      9. |  5   15jan2019     Bosatt   23 Jan 97     0   8027 |
     10. |  5   15jan2019     Bosatt           .     .      . |
         |----------------------------------------------------|
     11. |  6   15jun2003        Død   09 Oct 95     0   2806 |
     12. |  6   15jun2003        Død           .     .      . |
     13. |  7   15jan2019     Bosatt   31 May 96     0   8264 |
     14. |  7   15jan2019     Bosatt   03 May 07     0   4275 |
     15. |  8   15jan2019     Bosatt   27 May 97     0   7903 |
         |----------------------------------------------------|
     16. |  8   15jan2019     Bosatt   18 Apr 08     0   3924 |
     17. |  9   15jan2019     Bosatt   08 Jan 97     0   8042 |
     18. |  9   15jan2019     Bosatt   07 Jan 08     0   4026 |
     19. | 10   15jan2019     Bosatt   29 Nov 95     0   8448 |
     20. | 10   15jan2019     Bosatt   25 Jun 07     0   4222 |
         +----------------------------------------------------+
    I tried to include RegisStat in the reshape command, which returned the following error message:

    Code:
    . reshape long PartDat PartAg HADSAnxi HADSDepr RegisStat, i(id) j(_j)
    variable _j contains all missing values
    r(498);
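For what it is worth, reshape long only needs the stubs that have one variable per wave (PartDat2, PartDat3, and so on). Variables that are not listed are treated as constant within id and copied to every long row, which matches the listing above, where RegisStat appears on both rows. A minimal sketch on made-up data, assuming RegisStat is recorded once per person; the j name wave is an illustrative choice:

```stata
clear
* Toy wide data (made-up values): PartDat varies by wave, RegisStat does not
input id PartDat2 PartDat3 RegisStat
1 13444 17079 5
2 13121 .     1
end

* Only the wave-specific stub is listed; RegisStat is carried along unchanged
reshape long PartDat, i(id) j(wave)
list, sepby(id) noobs
```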
    When I remove id from stset, I do not get the same error, but the observation time is too short (last observed exit should be around 22 years):

    Code:
    . stset enddate, failure(RegisStat==5) origin(time PartDat) 
    
         failure event:  RegisStat == 5
    obs. time interval:  (origin, enddate]
     exit on or before:  failure
        t for analysis:  (time-origin)
                origin:  time PartDat
    
    ------------------------------------------------------------------------------
        157,924  total observations
         41,877  ignored because never entered
              6  observations end on or before enter()
    ------------------------------------------------------------------------------
        116,041  observations remaining, representing
         24,924  failures in single-record/single-failure data
      662179727  total analysis time at risk and under observation
                                                    at risk from t =         0
                                         earliest observed entry t =         0
                                              last observed exit t =     8,554
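One thing worth checking, stated as an assumption about this output: with %td dates and no scale() option, analysis time is measured in days, so a last observed exit of 8,554 would be about 8554/365.25 ≈ 23.4 years rather than a short follow-up. The sketch below uses two made-up rows and hypothetical names (start, died) to show scale(365.25) reporting time in years.

```stata
clear
* Toy single-record data (made-up rows; start and died are hypothetical names)
input id str9 s_start str9 s_end died
1 "14sep1995" "15jan2019" 0
2 "09oct1995" "15jun2003" 1
end
gen start   = date(s_start, "DMY")
gen enddate = date(s_end, "DMY")
format start enddate %td

* scale(365.25) divides the day counts so that _t is in years
stset enddate, failure(died==1) origin(time start) scale(365.25)
list id start enddate _t0 _t _d, noobs
```

Note that without id() each row counts as a separate subject, so people with two waves would still be counted twice.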

    RegisStat is coded as shown in the first block below; enddate is created as shown in the second:

    Code:
    -----------------------------------------------------------------------------------
    RegisStat                                                            Registerstatus
    -----------------------------------------------------------------------------------
    
                      type:  numeric (double)
                     label:  RegisStat
    
                     range:  [1,5]                        units:  1
             unique values:  3                        missing .:  0/157,924
    
                tabulation:  Freq.   Numeric  Label
                           117,678         1  Bosatt
                               730         3  Utvandret
                            39,516         5  Død
    Code:
    gen enddate = mdy(02,01,2019)
    format enddate %td
    label variable enddate "End of follow-up date"
    replace enddate = RegisStatDat if RegisStat==5

    Both reshape and stset have been working perfectly with wide data, using the same syntax.

    Including enddate in the reshape command gives:

    Code:
    (note: enddate2 not found)
    variable enddate already defined
    r(110);
    What am I doing wrong? Could it be that I've misspecified the enddate variable? It is created as shown above.



  • #2
    I recommend "Joint modelling of longitudinal and survival data in Stata" by Michael J Crowther. The following example from slide 17/18 seems to show a stset command on data with a similar structure:
    Code:
    clear
    
    * https://www.mjcrowther.co.uk/pdf/JM_course_lectures.pdf slide 17/18
    
    input id logb str20 trt time stime died
    4 .5877866     D-penicil     0           5.27051 1
    4 .4700036     D-penicil     .51473      5.27051 1
    4 .5306283     D-penicil     1.01851     5.27051 1
    4 1.163151     D-penicil     1.99595     5.27051 1
    4 1.308333     D-penicil     3.43336     5.27051 1
    4 1.386294     D-penicil     4.00285     5.27051 1
    4 1.667707     D-penicil     4.99398     5.27051 1
    end
    
    list , clean
    
    bysort id: gen start = time
    bysort  id: gen stop = start[_n+1]
    
    gen event = 0
    bysort id: replace stop = stime if _n==_N
    bysort id: replace event = died if _n==_N
     
    list id logb trt start stop event if id==4, table noobs sepby(id)
    
    stset stop, enter(start) failure(event=1) id(id)
    
    * END https://www.mjcrowther.co.uk/pdf/JM_course_lectures.pdf slide 17/18
    
    stdescribe, noshow
    stvary, noshow
    Also, you might consider using attained age as the time scale.
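Carried over to the variable names in the first post, the same start/stop pattern might look like the sketch below. This is an assumption about the intended analysis (one interval per attended wave, with death counted only on the last interval), not something taken from the slides. The rows are copied from ids 1 and 4 in the listing in the first post; the names wave, stop, event and first are made up for illustration.

```stata
clear
* A few long-format rows as in post #1; RegisStat 5 = Død, 1 = Bosatt
input id wave str9 s_part str9 s_end RegisStat
1 2 "22oct1996" "15nov2015" 5
1 3 "05oct2006" "15nov2015" 5
4 2 "14sep1995" "15jan2019" 1
4 3 ""          "15jan2019" 1
end
gen PartDat = date(s_part, "DMY")
gen enddate = date(s_end, "DMY")
format PartDat enddate %td

* A wave that was not attended has no start date, so it defines no interval
drop if missing(PartDat)

* Each interval runs from this wave's date to the next one; the last ends at enddate
bysort id (PartDat): gen stop = PartDat[_n+1]
by id: replace stop = enddate if _n == _N
by id: gen byte event = (RegisStat == 5) & (_n == _N)
by id: gen first = PartDat[1]
format stop first %td

* time0() gives each record its own start; origin() sets t = 0 at the first wave
stset stop, id(id) time0(PartDat) origin(time first) failure(event==1) scale(365.25)
list id PartDat stop event _t0 _t _d, sepby(id) noobs
```

Because each record now ends at a different time within id, the "multiple records at same instant" check should no longer flag these rows.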
    Last edited by Bjarte Aagnes; 04 Nov 2019, 12:40.



    • #3
      Thank you and tusen takk, Bjarte Aagnes! The slides you shared were exactly what I needed.
