Hi, I am trying to simulate some data to work on K-M survival curves. Below, I'm simulating data on IPV incidence for a sample of adults, starting at age 15.
In plot 1, the survival curve looks as expected (it starts at 1 on the y axis and curves gradually downward, flattening at older ages), but the x axis starts at 0, which I believe represents time since the start of the analytic window.
In plot 2, I'm trying to adjust the x axis labels so that the axis starts at age 16 (which is what I've attempted to set as the entry window). However, when I do this the curve no longer starts at 1 on the y axis, and I don't quite understand why.
Is there a better solution for adjusting the x axis range/labels, or am I misunderstanding something about the data or sts graph?
Code:
//// Simulate data on 20,000 adults ages 15-66
clear
set seed 123
set obs 20000
gen person_id = _n
gen age_in_2024 = 15 + int(51*runiform())
gen female = runiform() < 0.53

// Create IPV vars
gen ipv_prob = .
* For females: 40% prevalence ages 16-29, then decline
replace ipv_prob = 0.40 if female == 1 & age_in_2024 >= 16 & age_in_2024 <= 29
replace ipv_prob = 0.40 - (0.40-0.15)*(age_in_2024-29)/(66-29) if female == 1 & age_in_2024 > 29
* For males: 20% prevalence ages 16-29, then decline
replace ipv_prob = 0.20 if female == 0 & age_in_2024 >= 16 & age_in_2024 <= 29
replace ipv_prob = 0.20 - (0.20-0.08)*(age_in_2024-29)/(66-29) if female == 0 & age_in_2024 > 29
* Generate IPV event based on age-dependent probability
gen ipv_event = runiform() < ipv_prob
* Simulate age at first IPV for those who experience IPV
gen age_first_ipv = .
replace age_first_ipv = 16 + int((age_in_2024 - 16)*runiform()) if ipv_event == 1
* If no IPV, set age_first_ipv to current age (right-censor)
replace age_first_ipv = age_in_2024 if ipv_event == 0

// Set observation window starting at age 16
gen age_start = 16
gen age_end = age_first_ipv

// Declare survival time starting at minimum age (16)
stset age_end, failure(ipv_event == 1) origin(time age_start) id(person_id)

**** PLOT V1
sts graph, survival ytitle("Survival Probability") ///
    title("Kaplan-Meier Survival Curve")

**** PLOT V2
sts graph, survival ytitle("Survival Probability") ///
    title("Kaplan-Meier Survival Curve") xlabel(16(5)50) tmin(16)
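For reference, here is a minimal sketch of one way the axis question might be checked and worked around. It assumes the stset call above has already been run, and that analysis time is then _t = age_end - age_start, so that _t = 0 corresponds to age 16. The tick positions and the x axis title are illustrative choices of mine, not something I have confirmed is the recommended approach.

```stata
* Assumption: the stset command above has been run, so _t0 and _t exist.
* With origin(time age_start), analysis time should be age minus 16.
summarize _t0 _t

* Sketch: leave the analysis-time scale alone and only relabel the ticks,
* so that analysis time 0 is displayed as age 16, 5 as age 21, and so on.
sts graph, survival ytitle("Survival Probability") ///
    title("Kaplan-Meier Survival Curve") ///
    xtitle("Age (years)") ///
    xlabel(0 "16" 5 "21" 10 "26" 15 "31" 20 "36" 25 "41" 30 "46")
```

If that reading of analysis time is right, tmin(16) in plot 2 would be asking for the curve from analysis time 16 onward (roughly age 32), which might be why it no longer starts at 1.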