Hi!
I have an issue with a piece of code that I received help with in an earlier post, and I have now identified a further problem. I have a follow-up dataset of patients after surgery. I want to investigate whether or not they develop malignancy after having had surgery. Patients are followed up six months post-op and annually thereafter.
The data looks like this:
Since there is no date of diagnosis, I have used this code to randomize a date of diagnosis between the last follow-up at which there was no diagnosis and the follow-up at which the malignancy diagnosis was noted:
The issue is that some observations are left with a negative num_days and therefore no date, probably because malignancy is noted at the first follow-up after surgery:
Is there any way to write the script so that, if malignancy is noted at the first follow-up after surgery, it randomizes a date between surgery and the first follow-up? Extremely grateful for any help!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str16 id_code float malignancy long surg_date float followup_date
"B1020" . 17123 17754
"B1020" . 17123 17392
"B1020" . 17123 18149
"B1020" . 17123 18423
"B1020" . 17123 18819
"B1020" 1 17123 19244
"B1020" . 17123 19523
"B1020" . 17123 19988
"B1020" . 17123 20114
"B1020" . 16112 16592
"B1020" . 16112 17842
"B1020" . 16112 18201
"B1020" . 16112 18759
"B1020" . 16112 19147
"B1020" . 16112 19348
"B1020" . 16112 19371
"B1020" . 16112 20146
"B1020" . 16112 20529
"B1020" . 16112 20873
"B1020" . 16112 21314
"B1020" . 16112 21624
"B1020" . 16112 21838
"B1020" . 16112 17582
"B1020" . 16112 17123
"B1020" . 16112 16172
"B2012" . 17254 17223
"B2012" . 17254 18335
"B2012" . 17254 18719
"B2012" . 17254 19142
"B2012" . 17254 19416
"B2012" . 17254 19783
"B2012" . 17254 20117
"B2012" . 17254 20501
"B2012" . 17254 20832
"B2012" . 17254 17801
"B2012" . 17254 21321
"B2012" . 17254 21549
"B2012" . 17254 21999
"B2013" 1 14634 16767
"B2013" . 14634 16817
"B2013" . 14634 16585
"B2013" . 14634 17165
"B2010" . 15124 17271
"B2010" . 15124 17994
"B2010" . 15124 17593
"B2010" . 15124 18318
"B2010" . 15124 18706
"B2010" . 15124 19062
"B2010" . 15124 19416
"B2010" 1 15124 19789
"B2010" . 15124 20151
"B2010" . 15124 20523
"B2010" . 15124 20893
"B2010" . 15124 21271
"B2010" . 15124 21627
"B2010" . 15124 21998
"B2010" . 15124 16909
"B2010" . 15124 16583
"B2054" . 17348 17654
"B2054" . 17348 17843
"B2054" . 17348 18102
"B2054" . 17348 18548
"B2074" 1 17562 17849
"B2074" . 17562 18332
"B2074" . 17562 18474
"B2074" . 17562 18850
"B2074" . 17562 19237
"B2074" . 17562 19788
"B2074" . 17562 19863
"B2074" . 17562 20257
"B2074" . 17562 20742
"B2074" . 17562 20973
"B2074" . 17562 20249
"B2074" . 17562 22694
end
format %td surg_date
format %td followup_date
Code:
bys id_code (surg_date followup_date): gen cum_malignancy = sum(malignancy)
gen num_days = (followup_date-followup_date[_n-1])-1 if cum_malignancy == 1 & cum_malignancy[_n-1] == 0
set seed 123
gen wanted = followup_date[_n-1] + runiformint(1,num_days) if cum_malignancy == 1 & cum_malignancy[_n-1] == 0
format %td wanted
drop if !(cum_malignancy == 1 & cum_malignancy[_n-1] == 0)
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str16 id_code float malignancy long surg_date float(followup_date cum_malignancy num_days wanted)
"B1020" 1 17123 19244 1  424 18980
"B2010" 1 15124 19789 1  372 19701
"B2013" 1 14634 16767 1  181 16635
"B2074" 1 17562 17849 1 -700     .
end
format %td surg_date
format %td followup_date
format %td wanted
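One direction that might work, as a rough sketch rather than a tested solution: in the code above, the num_days and wanted lines have no by-prefix, so for the first row of a patient followup_date[_n-1] refers to the last row of the previous patient (for B2074 that is 17849 - 18548 - 1 = -700). The sketch below assumes the example data above are loaded, that malignancy is either 1 or missing, and that patients should still be grouped by id_code only. The variables first_dx and lower are hypothetical names introduced just for this illustration.

```stata
* Running count of malignancy records within each patient (as in the original code)
bysort id_code (surg_date followup_date): gen cum_malignancy = sum(malignancy)

* first_dx (hypothetical name): first follow-up with malignancy within a patient,
* including the case where it is the patient's very first row
by id_code: gen byte first_dx = cum_malignancy == 1 & (_n == 1 | cum_malignancy[_n-1] == 0)

* lower (hypothetical name): previous follow-up of the same patient,
* or the surgery date when there is no earlier follow-up
by id_code: gen lower = cond(_n == 1, surg_date, followup_date[_n-1])

* Number of candidate days strictly between lower and the diagnosing follow-up
gen num_days = followup_date - lower - 1 if first_dx

set seed 123
* Guard against num_days < 1, where no day lies strictly in between
gen wanted = lower + runiformint(1, num_days) if first_dx & num_days >= 1
format %td lower wanted

keep if first_dx
```

With this, B2074 would get lower = 17562 (the surgery date) and num_days = 17849 - 17562 - 1 = 286, so a date could be drawn between surgery and the first follow-up. Whether grouping by id_code alone is right for patients with more than one surg_date (such as B1020) is an open question that this sketch does not settle.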