Greetings,
I have a large dataset in which some observations have up to 17 variations, meaning the ID is the same but the operational codes differ. For a project, I classified more than 50 operational codes into three categories (MH = 0, 1, and 3). I wrote the following command to drop duplicate observations (same ID, dup > 1) whose MH code is 0 or 3. But when I tabulate the dup variable after dropping them, I still see many duplicate observations that I don't need. I would appreciate your advice.
drop if (dup > 1 & (MH == 0 | MH == 3))
Thanks,
Eliot
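For reference, here is a minimal sketch of one way to end up with a single row per ID. It assumes the goal is to keep one observation per ID and to prefer a row with MH == 1 when one exists; that goal is a guess, not something the post confirms. ID and MH are the names used above, while is_mh1 is a hypothetical helper variable. Note that dup is a stored variable, so drop does not recalculate it: any row with dup > 1 and MH == 1 survives the command above and still shows up in the tabulation.

```stata
* Hypothetical helper: 1 if the row is in the MH == 1 category, 0 otherwise
generate byte is_mh1 = (MH == 1)

* Within each ID, sort so that MH == 1 rows come last, then keep the last row.
* Assumption: exactly one row per ID is wanted; ties are broken arbitrarily.
bysort ID (is_mh1): keep if _n == _N

* Check the result on the current data instead of relying on the old dup values
duplicates report ID
drop is_mh1
```

If several rows per ID should be kept instead (for example, every MH == 1 row), the keep condition would need to change accordingly.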