
  • correct median follow up in stset?

    Hello all,
    I am confused about which command gives the correct median follow-up time in survival analysis: stsum or stdescribe.

    . stdescribe

             failure _d:  recur2 == 1
       analysis time _t:  betweendx2recur/30

                                       |-------------- per subject --------------|
    Category                   total        mean         min     median        max
    ------------------------------------------------------------------------------
    no. of subjects              175
    no. of records               175           1           1          1          1

    (first) entry time                         0           0          0          0
    (final) exit time                    21.3101    .4333333   17.03333   93.33333

    subjects with gap              0
    time on gap if gap             0
    time at risk           3729.2667     21.3101    .4333333   17.03333   93.33333

    failures                     120    .6857143           0          1          1
    ------------------------------------------------------------------------------

    . stsum

             failure _d:  recur2 == 1
       analysis time _t:  betweendx2recur/30

             |               incidence        no. of   |------ Survival time -----|
             | time at risk       rate      subjects        25%       50%       75%
    ---------+---------------------------------------------------------------------
       total |  3729.266667   .0321779           175   13.33333  18.66667      26.9


    They give me different medians. I am inclined to trust stsum, since stdescribe seems to be summarizing the raw exit times. Any insight? Thanks!

  • #2
    I am sympathetic to your cause, as this is at least the third time you have posted the same question without a reply. I do not do survival analysis myself, but here is what I gather from a quick read of the manual. For stsum:

    Methods and formulas: The 25th, 50th, and 75th percentiles of survival times are obtained from S(t), the Kaplan–Meier product-limit estimate of the survivor function. The 25th percentile, for instance, is obtained as the minimum value of t such that S(t) ≤ 0.75.
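
    For reference, the quoted definition can be sketched in symbols. The notation below (t_j, d_j, n_j) is standard textbook notation assumed here, not copied from the manual: t_j are the distinct observed failure times, d_j is the number of failures at t_j, and n_j is the number of subjects still at risk just before t_j.

    ```latex
    \hat S(t) = \prod_{j:\, t_j \le t} \left( 1 - \frac{d_j}{n_j} \right),
    \qquad
    t_{50} = \min\{\, t : \hat S(t) \le 0.5 \,\}
    ```

    Censored subjects drop out of n_j without ever adding to d_j, so this median accounts for censoring, whereas a plain median of exit times treats every exit the same way.
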
    The results of stdescribe, on the other hand, can be obtained with the summarize command. Based on some experimentation, with single-record survival data (number of records = number of subjects), stsum and stdescribe give approximately the same median, at least when few subjects are censored (in the example below, 36 of 40 subjects fail).

    Code:
    webuse page2, clear
    stdescribe
    stsum
    Res.:

    Code:
    . stdescribe
    
             failure _d:  dead
       analysis time _t:  time
    
                                       |-------------- per subject --------------|
    Category                   total        mean         min     median        max
    ------------------------------------------------------------------------------
    no. of subjects               40   
    no. of records                40           1           1          1          1
    
    (first) entry time                         0           0          0          0
    (final) exit time                     227.95         142        231        344
    
    subjects with gap              0   
    time on gap if gap             0   
    time at risk                9118      227.95         142        231        344
    
    failures                      36          .9           0          1          1
    ------------------------------------------------------------------------------
    
    . 
    . stsum
    
             failure _d:  dead
       analysis time _t:  time
    
             |               Incidence     Number of   |------ Survival time -----|
             | Time at risk       rate      subjects        25%       50%       75%
    ---------+---------------------------------------------------------------------
       Total |        9,118   .0039482            40        198       232       261
    However, with multiple-record survival data (number of records ≠ number of subjects) and more censoring (below, only 75 of 103 subjects fail), you get different results.

    Code:
    webuse stan3, clear
    stsum 
    stdescribe
    Code:
    . stsum 
    
             failure _d:  died
       analysis time _t:  t1
                     id:  id
    
             |               Incidence     Number of   |------ Survival time -----|
             | Time at risk       rate      subjects        25%       50%       75%
    ---------+---------------------------------------------------------------------
       Total |     31,938.1   .0023483           103         36       100       979
    
    . 
    . stdescribe
    
             failure _d:  died
       analysis time _t:  t1
                     id:  id
    
                                       |-------------- per subject --------------|
    Category                   total        mean         min     median        max
    ------------------------------------------------------------------------------
    no. of subjects              103   
    no. of records               172    1.669903           1          2          2
    
    (first) entry time                         0           0          0          0
    (final) exit time                   310.0786           1         90       1799
    
    subjects with gap              0   
    time on gap if gap             0           .           .          .          .
    time at risk             31938.1    310.0786           1         90       1799
    
    failures                      75    .7281553           0          1          1
    ------------------------------------------------------------------------------
    I have no idea what things look like for multiple-failure data and will not attempt to find out. As promised above, here is how you can reproduce the stdescribe result using summarize. Since stdescribe reports per-subject statistics, we can use the contract command to obtain one record per subject.

    Code:
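    webuse stan3, clear
    * collapse to one record per subject (stime is constant within id)
    contract id stime
    * detailed summary; the 50% row is the median exit time
    sum stime, d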
    Code:
    . sum stime, d
    
                        Survival time (Days)
    -------------------------------------------------------------
          Percentiles      Smallest
     1%            2              1
     5%            3              2
    10%            6              2       Obs                 103
    25%           32              2       Sum of Wgt.         103
    
    50%           90                      Mean           310.0777
                            Largest       Std. Dev.      427.9682
    75%          427           1407
    90%          979           1571       Variance       183156.8
    95%         1386           1586       Skewness       1.725621
    99%         1586           1799       Kurtosis       5.129539
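
    A variant that does not depend on the dataset carrying a per-subject variable like stime: keep each subject's last record and summarize _t, the analysis-time variable that stset creates. This is only a sketch; it assumes the data are already stset with an id() variable, as stan3 is, and it has not been checked beyond that case.

    ```stata
    webuse stan3, clear
    * sort records within subject by analysis time and keep the last one;
    * its _t is that subject's final exit time
    bysort id (_t): keep if _n == _N
    * the median shown here should correspond to the
    * "(final) exit time" median from stdescribe
    summarize _t, detail
    ```

    As with the contract approach, this is a plain median of exit times and ignores whether each subject failed or was censored.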
    If you ask for my opinion, I would favor reporting stsum's results derived from the survivor function, but I am no expert here.
