I need some help creating a unique case id per patient. My dataset has duplicate records per patient. I am trying to identify unique patients by patient health card (HC) number and, if the HC number is missing, by the patient's last_name, first_name and birth_date.

The HC numbers in the dataset may be entered inconsistently: a patient may be recorded with complete information (HC, first_name, last_name and birth_date) in one record and without the HC number in another. Similar situations may occur with errors in the first name, last name or date of birth. I want to create a unique patient id that accounts for all of these issues.

Here is a small dataset to illustrate my situation, where a . in the hc column means the HC number is missing and case_id is the id I would like to obtain:

     hc    last_name   first_name      birth_date    case_id
-------------------------------------------------------------------------
 1.  333   Smith       John            Jan 1 2007    1
 2.  .     Smith       John            Jan 1 2007    1
 3.  547   Marek       Rob             Jun 12 2007   2
 4.  487   Red         Flower          May 2 2002    3
 5.  487   Red         White(Flower)   May 2 2002    3
-------------------------------------------------------------------------
 6.  333   Smith       Jon             Jan 1 2007    1
-------------------------------------------------------------------------
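As a rough illustration of the grouping described above, the following Python sketch links two records when they share an HC number or have the same last name, first name and birth date. It uses a union-find structure so that chains of matches (record 2 matches record 1 by name, record 6 matches record 1 by HC) end up under one id. The function name assign_case_ids, the records list and the use of None for a missing HC are assumptions made for this example only. The sketch handles exact matches and nothing more: a record with a missing HC and a misspelled name would not be linked, and two different patients who share a name and birth date would be merged.

```python
# Sample records copied from the table above; None marks a missing HC number.
records = [
    {"hc": "333", "last_name": "Smith", "first_name": "John", "birth_date": "Jan 1 2007"},
    {"hc": None, "last_name": "Smith", "first_name": "John", "birth_date": "Jan 1 2007"},
    {"hc": "547", "last_name": "Marek", "first_name": "Rob", "birth_date": "Jun 12 2007"},
    {"hc": "487", "last_name": "Red", "first_name": "Flower", "birth_date": "May 2 2002"},
    {"hc": "487", "last_name": "Red", "first_name": "White(Flower)", "birth_date": "May 2 2002"},
    {"hc": "333", "last_name": "Smith", "first_name": "Jon", "birth_date": "Jan 1 2007"},
]


def assign_case_ids(records):
    """Return one case id per record (hypothetical helper, exact matching only)."""
    parent = list(range(len(records)))  # union-find forest, one node per record

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the earlier record as root.
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # matching key -> index of the first record carrying it
    for i, rec in enumerate(records):
        keys = [("name_dob", rec["last_name"].lower(),
                 rec["first_name"].lower(), rec["birth_date"])]
        if rec["hc"] is not None:
            keys.append(("hc", rec["hc"]))
        for key in keys:
            if key in first_seen:
                union(i, first_seen[key])
            else:
                first_seen[key] = i

    # Number the groups 1, 2, 3, ... in order of first appearance.
    root_to_id = {}
    case_ids = []
    for i in range(len(records)):
        root = find(i)
        if root not in root_to_id:
            root_to_id[root] = len(root_to_id) + 1
        case_ids.append(root_to_id[root])
    return case_ids


print(assign_case_ids(records))
```

On the six sample records this reproduces the case_id column shown in the table. Handling misspellings when the HC number is also missing would need some form of approximate matching on top of this.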
Thank you in advance for all your time and efforts.
Adriana Peci