Dear Statalist subscribers,
I am trying to use the "strs" command to estimate relative survival. To start off, I ran the command on the "colon.dta" sample dataset, merging with the popmort data (this is the example provided in Stata). This worked perfectly fine. However, when I run the same command on my own data, I keep getting this error message:
3638 records fail to match with the population file (popmort_CH_2014.dta).
That is, there are combinations of the mergeby() variables that do not exist in popmort_CH_2014.dta.
This will occur, for example, when patients are followed-up beyond
the last year for which population mortality data are available.
Records that did not match have been written to _merge_error_.dta).
Comparing my patient dataset with the colon dataset, I don't see what I am doing wrong: my follow-up ends in 2011 (when I stop the study), which is not beyond the last year for which the population mortality data are available (2014).
stset time_out, failure(vital_status=1) id(id) exit(time td(30sep2011))
strs using "popmort_CH_2014", breaks(0(1)5) mergeby(_year sex _age) by(sex)
Available variables in the patient file: id ; yydx ; age ; sex ; date_discharge ; vital_status ; time_out ; surv_yy ; _st ; _d ; _origin ; _t ; _t0
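A minimal diagnostic sketch for narrowing this down, under one assumption that is not confirmed here: that _merge_error_.dta (the file named in the error message) contains the mergeby() variables _year, sex and _age. It lists the values in the unmatched records next to the ranges covered by the population file, so that any value falling outside the population file (for example a year or attained age beyond its range, or a different coding of sex) becomes visible.

```stata
* Records that strs could not match (file named in the error message).
* Assumption: this file holds the mergeby() variables _year, sex and _age.
use _merge_error_, clear
describe
tabulate _year
tabulate _age
* nolabel shows the underlying codes rather than any value labels
tabulate sex, nolabel

* Years, ages and sex codes covered by the population file.
use popmort_CH_2014, clear
summarize _year _age
tabulate sex, nolabel
```

Both "use" commands replace the data in memory, so the patient dataset should be saved first.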
I am guessing that I am missing something obvious, but I would greatly appreciate any help!
Example observations from the practice dataset (colon.dta)
sex | age | stage | mmdx | yydx | surv_mm | surv_yy | status | subsite | id | _st | _d | _t | _t0 |
Male | 72 | Localised | 2 | 1989 | 2 | 0.01 | Dead: other | Descending and sigmoid | 1 | 1 | 1 | 0.01 | |
Female | 82 | Distant | 12 | 1991 | 2 | 0.01 | Dead: cancer | Descending and sigmoid | 2 | 1 | 1 | 0.01 | |
Example observations from the popmort dataset (popmort_CH_2014)
_year | _age | sex | rate | prob |
1950 | 16 | 1 | 0.001167 | 0.9988337 |
1950 | 17 | 1 | 0.001183 | 0.9988177 |
Example observations from the patient dataset
id | yydx | age | sex | date_dis | vital_status | time_out | surv_yy | _st | _d | _origin | _t | _t0 |
77885 | 2005 | 55 | Female | 31. Jan 07 | 1 | 21. Nov 09 | 2.806374 | 1 | 1 | 17197 | 2.806297 | |
77886 | 2009 | 70 | Male | 30. Sep 09 | | 30. Sep 11 | 1.998686 | 1 | | 18170 | 1.998631 | |