I have a dataset that includes one yearly observation per individual (1994-2023). Individuals can exit the dataset (death or other reasons) and a cut off date (July 1st) is used to decide what year that happens. If they exit then they have an 'exitdate'. I want to identify when individuals will exit the dataset and tag them for removal in the appropiate year.
In the example below i show data for 2 individuals, with a record for each year and an exitdate. ID 8 will exit the dataset in 2019 as their exitdate is before July 1st of 2019. ID 13 will exit in 2010 as their exit date is after July 1st of 2009.
I have used the following code to identify when each individual should exit.
My question is how can i do this more efficiently, perhaps in a loop?
Thank you
In the example below i show data for 2 individuals, with a record for each year and an exitdate. ID 8 will exit the dataset in 2019 as their exitdate is before July 1st of 2019. ID 13 will exit in 2010 as their exit date is after July 1st of 2009.
Code:
* Example generated by -dataex-. For more info, type help dataex clear input long akdnid int year float exitdate 8 2008 21617 8 2009 21617 8 2010 21617 8 2011 21617 8 2012 21617 8 2013 21617 8 2014 21617 8 2015 21617 8 2016 21617 8 2017 21617 8 2018 21617 8 2019 21617 13 2008 18100 13 2009 18100 13 2010 18100 end format %td exitdate
I have used the following code to identify when each individual should exit.
Code:
gen year_cut = (mdy(07,01,2008)) if year ==2008 replace year_cut = (mdy(07,01,2009)) if year ==2009 replace year_cut = (mdy(07,01,2010)) if year ==2010 replace year_cut = (mdy(07,01,2011)) if year ==2011 replace year_cut = (mdy(07,01,2012)) if year ==2012 replace year_cut = (mdy(07,01,2013)) if year ==2013 replace year_cut = (mdy(07,01,2014)) if year ==2014 replace year_cut = (mdy(07,01,2015)) if year ==2015 replace year_cut = (mdy(07,01,2016)) if year ==2016 replace year_cut = (mdy(07,01,2017)) if year ==2017 replace year_cut = (mdy(07,01,2018)) if year ==2018 replace year_cut = (mdy(07,01,2019)) if year ==2019 replace year_cut = (mdy(07,01,2020)) if year ==2020 replace year_cut = (mdy(07,01,2021)) if year ==2021 replace year_cut = (mdy(07,01,2022)) if year ==2022 format year_cut %td gen exit = 1 if exitdate<year_cut replace exit = 0 if exitdate>year_cut & exitdate !=. drop if exit ==1
Thank you
Comment