  • Kaplan Meier Survival

    Hey,

    I am doing a project where I need to assess the survival of renal cancer patients on/off a particular drug type. I am aware that I need to do a Kaplan-Meier analysis, and I have arranged my data into three columns: censored (alive)/uncensored (dead), days alive since diagnosis, and group (on/off the drug). I have Stata and have NO idea how to make the Kaplan-Meier curve. I have never used this software before and would dearly appreciate any help. I can upload the file if need be.

    thanks!

    simon

  • #2
    See the [ST] manual, in particular the entries for stset and sts graph.


    • #3
      We definitely need to see your Stata data set to see how the data are set out. Don't upload it, though. Install the -dataex- command (-ssc install dataex-) if you don't already have it. And use that to post an example of your data.


      • #4
        Simon,

        Christophe and Clyde give good advice. I will add the following general template:

        Code:
        stset time, failure(censor==0)
        sts graph, by(group)
        where "time" is the name of your time variable, "censor" is the name of your censoring variable (assumed to be 1 if censored, 0 if uncensored), and "group" is the name of your treatment variable. Note that if your censoring variable is reverse coded (1 for uncensored/failed, 0 for censored/survived), you can use failure(censor), as the option assumes failure when the variable is equal to 1.

        Other commands that you can use after stset include sts list (for a table of survival probabilities) and sts test (to test for differences between groups).
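
        As a rough sketch of those two follow-up commands, assuming the data have already been stset as in the template and that "group" is again just a placeholder for your treatment variable:

        ```stata
        * Assumes the data are already stset; "group" is a placeholder name
        * Table of Kaplan-Meier survivor estimates, one block per group
        sts list, by(group)
        * Test of equality of survivor functions (log-rank by default)
        sts test group
        ```

        By default sts test performs the log-rank test; other tests are available through its options if they suit your design better.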

        Regards,
        Joe


        • #5
          Simon:
          welcome to the list.
          As an aside to the previous helpful advice, I would recommend the following textbook: http://www.stata.com/bookstore/survi...-introduction/
          Kind regards,
          Carlo
          (Stata 15.1 SE)


          • #6
            Hi and thanks for both question and answers.
            I have much the same problem. I am using Stata 14.2 and have set my failure variable to Pathology yes/no (coded 0/1).
            I have 348 patients and only 130 failures.
            When I type -sts graph, by(group) ci risktable- I can follow the survival curve down to zero, and the censored patients (the ones without Pathology==1) are missing from the survival curve.

            I have been over the Stata PDF documentation and -help stset- again and again, and have watched the Stata YouTube videos several times.

            can anyone tell me what I am doing wrong?
            I would be so grateful since I really don't know what I can do from here.


            • #7
              Ditte, as others requested, we are trying to help you, but we cannot help you without a better description of what you are working with.

              In a post on the Mata forum (I redirected you here), you did post:

              Code:
              stset obs_time_m12_m4 if m12_FinishedinProject ==1, failure( PathologyFindings_m4_tumor )scale(365.25)
              
              failure event: PathologyFindings_m4_tumor != 0 & PathologyFindings_m4_tumor < .
              obs. time interval: (0, obs_time_m12_m4]
              exit on or before: failure
              t for analysis: time/365.25
              if exp: m12_FinishedinProject ==1
              
              
              699 total observations
              351 ignored at outset because of -if <exp>-
              
              348 observations remaining, representing
              130 failures in single-record/single-failure data
              254.294 total analysis time at risk and under observation
              at risk from t = 0
              earliest observed entry t = 0
              last observed exit t = 1.336071
              
              
              then; 
              sts graph, by (PatientProtocol_n) ci risktable
              Unfortunately, this doesn't tell us enough without knowing how you -stset- the data, or without an example of the data. Don't forget to remove all identifying information if you post an example of data.
              Please use the code delimiters to show code and results - use the # button on the formatting toolbar, between the " (double quote) and <> buttons.

              Please use the command -dataex- to show a representative sample of data; it is installed already if you have Stata 14.2 or 15.1, else you can install it by typing

              Code:
              ssc install dataex


              • #8
                Ditte:
                as an aside to Weiven's helpful comment, please note that in -sts graph- you can highlight censored observations with hash marks via the -Plot censoring, entries, etc...- option.
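
                From the command line, the same thing can be sketched with the censored() option of -sts graph-. This assumes the data are already -stset- and reuses the grouping variable from the earlier posts:

                ```stata
                * Assumes the data are already stset
                * censored(single) draws one tick mark at each censoring time
                sts graph, by(PatientProtocol_n) censored(single)
                ```

                If many patients are censored at the same time, censored(number) may be easier to read, since it prints the count above a single tick.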
                Kind regards,
                Carlo
                (Stata 15.1 SE)


                • #9
                  OK, thanks.
                  I will try my best to give further details on my data. I am a novice both with Stata and at writing in this forum, so thank you all for your patience.

                  My time variable runs from the date of the latest operation (Patient_DateOfLatestTURB) to the patient's last visit (m12_visitDate); the dates are in mm/dd/yyyy format.
                  The reason for "351 missing values generated" is that not all of the patients have m12_FinishedinProject==1 (1 = yes, 0 = no).
                  The date of randomization into the control and intervention groups is m4_visitDate.

                  . generate time_obs_all = (m12_visitDate - Patient_DateOfLatestTURB) / 365.25 if m12_FinishedinProject==1
                  (351 missing values generated)

                  The patients come for a check-up at the following times after their latest operation (called TURB):
                  4 months after = m4_PathologyFindings
                  8 months after = m8_PathologyFindings
                  12 months after = m12_PathologyFindings
                  The findings at each visit, look like this:
                  1 tumor
                  2 normal
                  3 inflammation
                  4 other

                  I can show you:
                  . tab PatientProtocol_n m4_PathologyFindings if m12_FinishedinProject==1


                  Patient_Prot |            m4_PathologyFindings
                          ocol |     tumor     normal  inflammat      other |     Total
                  -------------+--------------------------------------------+----------
                       control |        68         13         13          3 |        97
                  intervention |        62         11         12          0 |        85
                  -------------+--------------------------------------------+----------
                         Total |       130         24         25          3 |       182

                  . tab PatientProtocol_n m8_PathologyFindings if m12_FinishedinProject==1

                  Patient_Prot |            m8_PathologyFindings
                          ocol |     tumor     normal  inflammat      other |     Total
                  -------------+--------------------------------------------+----------
                       control |        37          5          8          1 |        51
                  intervention |        30          3          3          2 |        38
                  -------------+--------------------------------------------+----------
                         Total |        67          8         11          3 |        89

                  . tab PatientProtocol_n m12_PathologyFindings if m12_FinishedinProject==1

                  Patient_Prot |            m12_PathologyFindings
                          ocol |     tumor     normal  inflammat      other |     Total
                  -------------+--------------------------------------------+----------
                       control |        58          5          6          1 |        70
                  intervention |        38          3          9          1 |        51
                  -------------+--------------------------------------------+----------
                         Total |        96          8         15          2 |       121


                  I would like to stset the data so I can see time to first recurrence, on a Kaplan Meier if possible.
                  . generate pathology_all = ( m4_PathologyFindings==1 | m8_PathologyFindings==1 | m12_PathologyFindings==1)

                  After that I -stset- my data:


                  . stset time_obs_all if m12_FinishedinProject ==1, id(Patient_ID_n) failure(pathology_all==1)

                  id: Patient_ID_n
                  failure event: pathology_all == 1
                  obs. time interval: (time_obs_all[_n-1], time_obs_all]
                  exit on or before: failure
                  if exp: m12_FinishedinProject ==1

                  ------------------------------------------------------------------------------
                  699 total observations
                  351 ignored at outset because of -if <exp>-
                  ------------------------------------------------------------------------------
                  348 observations remaining, representing
                  348 subjects
                  183 failures in single-failure-per-subject data
                  375.65 total analysis time at risk and under observation
                  at risk from t = 0
                  earliest observed entry t = 0
                  last observed exit t = 1.645448

                  . sts graph, by ( PatientProtocol_n ) ci risktable

                  failure _d: pathology_all == 1
                  analysis time _t: time_obs_all
                  id: Patient_ID_n

                  Then the graph presents only the patients with a failure!
                  I wish I could show you, but I don't know how to attach the graph. The "upload attachment" button is not working for me.

                  hope this is helpful

                  I will be so grateful if someone can help me figuring this out.
                  Please bear with me if the above is not enough, or too much.

                  Thanks,
                  Ditte






                  • #10
                    Ditte,

                    Earlier I was incorrect to say you didn't show your -stset- code. I got confused because you posted your reply on this thread ... which you are entitled to do, but it would probably be better to start a new one for clarity. Also, it helps us to read your code if you use the code delimiters, which enclose your code in a nice box like in my earlier post. Use the # button in the formatting toolbar.

                    I can't tell exactly why your -sts graph- command is showing only failures. But there are some things about your code that look like possible errors.

                    I think you're saying that patients get 3 visits at 4, 8, and 12 months after operation. You calculated the observation time, time_obs_all, as the time between the last operation and the 12-month visit date. You say that you have 348 subjects with valid data.

                    You then show cross-tabs of pathology findings at each visit. The denominators for each visit are 182, 89, and 121 patients. That's a total of 392, which implies that not many patients got more than one follow up, and that everyone's follow up time is different. Also, your -stset- output says there are 183 failures. From your crosstabs, I see 130, 67, and 96 patients had tumors found at each of the pathology visits respectively. That totals 293. That is a lot more than -stset- says. This could be true if patients were coming to multiple follow up visits, I guess. But it is a very confusing scheme of follow up. Please correct me if I have misunderstood your output.

                    Moreover, there's no indication that if someone had a tumor at, say, 4 months, you recoded their time variable to 4 months (or more precisely, the 4-month visit date minus the operation date). You would need to recode the observation time based on the earliest date where a tumor was detected for this to work. Can you give us a summary of observation time in code delimiters?

                    Code:
                    summarize time_obs_all if m12_FinishedinProject ==1, detail
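
                    The recoding of observation time described above might look roughly like the sketch below. It is only an illustration under assumptions: m8_visitDate is a guessed variable name (only m4_visitDate and m12_visitDate appear in your post), all dates are assumed to be numeric Stata daily dates, and recur, end_date and time_first are new names I made up.

                    ```stata
                    * Sketch only; m8_visitDate is an assumed variable name
                    * Event indicator: tumor (code 1) found at any of the three visits
                    generate byte recur = (m4_PathologyFindings == 1 | m8_PathologyFindings == 1 | m12_PathologyFindings == 1)

                    * End of follow-up: the 12-month visit by default, then overwritten
                    * by the earliest visit at which a tumor was found
                    * (the 8-month visit is replaced first so the 4-month visit wins)
                    generate double end_date = m12_visitDate
                    replace end_date = m8_visitDate if m8_PathologyFindings == 1
                    replace end_date = m4_visitDate if m4_PathologyFindings == 1

                    * Time from the latest operation to first recurrence or last visit, in years
                    generate double time_first = (end_date - Patient_DateOfLatestTURB) / 365.25

                    stset time_first if m12_FinishedinProject == 1, failure(recur == 1)
                    sts graph, by(PatientProtocol_n) ci risktable
                    ```

                    Patients with a missing visit date will end up with a missing time_first and be dropped by -stset-, so it is worth checking how many that affects before trusting the graph.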
                    Last, you say you coded survival time based on the date of the 12-month visit. But your tables above appear to indicate that only 1/3 of your sample got a 12-month pathology report. Does not every visit get a pathology report?

                    Your code to indicate if there was tumor pathology on any one visit is correct as far as it goes. I can't see any errors in your -stset- code, but it's been a while since I did survival analysis. If the upload attachment button on the forum isn't working, then there are free image hosting sites like www.imgur.com that will take graphs. I haven't seen anyone on the forum use these, but they are not prohibited by the FAQ.

                    Let's just focus on maybe seeing your graph and getting it to run. But, I think that the study has a lot of issues. For example, say one person's 4-month pathology report was normal, but their 8-month report had a tumor. Most properly, you know they developed a tumor between 4 and 8 months, but you don't know when. When you code the observation time based on the 8 month visit, Stata will treat their survival time as 8 months exactly. I believe this is interval censoring, and this may be more appropriate for discrete time survival.


                    • #11
                      As you suggest, I will post a new one.
