Hi there, I am currently working with a large database of over 1 million observations.
I am trying to plot a KM curve for different surgeries. I cannot share the exact data because it is stored on a remote platform with no internet access! <sigh> To understand what is going on and to check that I am using the right code, I created a smaller dataset. I am sure some people will complain about this, but there is nothing I can do except complain to the government, who won't give us internet on our remote system, so sorry about this!
Here's the sample data
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(type revision yearofsurgery yearofrevision timetorevision_years) byte(_st _d) double _t byte _t0
1 1 14610 16438 5.008219 1 0 3 0
1 0 15310 . . 0 . . .
0 0 16468 . . 0 . . .
1 1 17867 18263 1.0849315 1 1 1.084931492805481 0
1 0 17932 . . 0 . . .
1 1 18298 19422 3.079452 1 0 3 0
0 1 19029 20794 4.835617 1 0 3 0
0 0 16109 . . 0 . . .
0 1 15745 16111 1.0027397 1 1 1.002739667892456 0
0 1 18303 20498 6.013699 1 0 3 0
end
format %td yearofsurgery
format %td yearofrevision
label values type surgery
label def surgery 0 "Sling", modify
label def surgery 1 "Pessary", modify
yearofsurgery = date when surgery took place
yearofrevision = date when the revision of the surgery took place
type = type of surgery performed
I then attempted to create two KM curves using this code:
Code:
//Generate time variable from operation to revision
gen timetorevision_years = (yearofrevision - yearofsurgery)/365
stset timetorevision_years, failure(revision=1) exit(time 3)
sts graph, by(type) //Creates two KM curves for the different types of procedure
stdescribe
My sample data creates a nice KM curve, which looks just right, as seen below.
When I replicated the code on my remote dataset, I got these curves (the lines are really close together).
Question 1: How can I separate out the KM curves on the graph?
In fact, when I plot the cumulative hazard instead, the curves do separate out.
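One thing that might be worth trying for Question 1, as a sketch only: as far as I know, sts graph draws the survivor function with default y-axis labels running from 0 to 1, while the cumulative hazard plot scales its axis to the data, which could explain why only the latter separates. The lower bound of 0.90 and the step of 0.02 below are placeholders I made up, not values taken from the remote data, and I have not run this there.

```stata
* Sketch only: zoom the y-axis of the KM plot.
* 0.90 and 0.02 are assumed placeholders; pick values that fit the real curves.
sts graph, by(type) ylabel(0.90(0.02)1, format(%4.2f))

* Alternative view: plot the failure function 1 - S(t) instead of S(t).
sts graph, by(type) failure

* For comparison: the Nelson-Aalen cumulative hazard plot mentioned above.
sts graph, by(type) cumhaz
```

If the narrower ylabel() range does not change the axis on your version of Stata, the failure plot may need the same kind of adjustment.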
Question 2:
In my current sample data provided above, when I use
Code:
stsum, by(type)
I get missing values for 50% and 75%. Why does this happen?
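A possible check for Question 2, again only a sketch: as I understand it, stsum reports a percentile of survival time only if the KM survivor estimate falls far enough (to 0.75 or below for 25%, 0.50 or below for 50%, 0.25 or below for 75%) before follow-up ends, which here is cut off by exit(time 3). In the sample listing above, only the rows with _st equal to 1 enter the analysis, and each type appears to have three such rows with a single failure (_d equal to 1), so the estimate would drop only to about 0.67 in each group. Listing the survivor function should confirm or refute this.

```stata
* Sketch only: list the KM survivor function by surgery type
* to see the lowest value it reaches within the 3-year window.
sts list, by(type)

* Count observations kept by stset (_st == 1), split by failure indicator _d.
tabulate type _d if _st == 1
```

If the lowest survivor value listed for a group is above 0.50, the missing 50% and 75% entries would be expected rather than an error.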