Hi,
The data shown below contain no date for when the malignancy was diagnosed. I would therefore like to create a new variable holding the date midway between the follow-up at which the malignancy was first noted and the previous follow-up, and to drop the other, irrelevant observations. For the example patient, that is the date midway between 2020-08-17 and 2021-10-12.
The script further down, provided by the forum, selects the 2021-10-12 observation. I would essentially like something similar, but with the date in between that follow-up and the previous one. Unfortunately, I'm not skilled enough myself and would appreciate any help I could get!
I have a large set of observations in a follow-up file, looking like this:
ID_CODE | MALIGNANCY | DATE OF SURGERY | DATE OF FOLLOW-UP
B1020   | NO         | 2019-09-22      | 2019-12-13
B1020   | NO         | 2019-09-22      | 2020-08-17
B1020   | YES        | 2019-09-22      | 2021-10-12
B1020   | YES        | 2019-09-22      | 2022-06-30
I have been using this script provided by the forum to choose the 2021-10-12 observation:
Code:
keep if inlist(malignancy, "Y")
bysort id_code: gen closest = abs(surg_date-followup_date)
gsort closest
sort trr_id_code
by id_code (closest), sort: keep if _n == 1
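One possible way to get the in-between date is sketched below. It rests on several assumptions that are not confirmed by the post: that malignancy is a string variable holding "YES"/"NO" (as in the table, whereas the script above tests for "Y"), that surg_date and followup_date are numeric Stata daily dates, and that the surgery date should stand in as the "previous" date when the malignancy is already present at the first follow-up. The variables n_yes, prev_date and malig_date, as well as surg_str and fu_str in the toy data, are made-up names for illustration.

```stata
* Toy data rebuilt from the example table (one patient only)
clear
input str5 id_code str3 malignancy str10 surg_str str10 fu_str
"B1020" "NO"  "2019-09-22" "2019-12-13"
"B1020" "NO"  "2019-09-22" "2020-08-17"
"B1020" "YES" "2019-09-22" "2021-10-12"
"B1020" "YES" "2019-09-22" "2022-06-30"
end

* Assumption: dates are stored as numeric Stata daily dates
gen surg_date     = daily(surg_str, "YMD")
gen followup_date = daily(fu_str, "YMD")
format surg_date followup_date %td

* Sort follow-ups chronologically within patient and count the
* running number of "YES" records; the first has n_yes == 1
bysort id_code (followup_date): gen n_yes = sum(malignancy == "YES")

* Previous follow-up date within patient; falls back to the surgery
* date when there is no earlier follow-up (an assumption)
by id_code: gen prev_date = cond(_n == 1, surg_date, followup_date[_n-1])

* Midpoint between the previous follow-up and the current one,
* rounded down to a whole day
gen malig_date = floor((prev_date + followup_date) / 2)
format prev_date malig_date %td

* Keep only the follow-up at which the malignancy was first noted
keep if malignancy == "YES" & n_yes == 1
list id_code followup_date prev_date malig_date
```

On the example rows this keeps the 2021-10-12 observation and sets malig_date to 15mar2021, which lies 210 days after 2020-08-17 and 211 days before 2021-10-12. Patients who never have a "YES" record are dropped by the final keep, so that line would need adjusting if they should stay in the data.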