Hi,
I have a bunch of data with a bunch of problems. Each person in our data sets should have one ID number, but this is not the case. Some people have five or more ID numbers. The problem we are running into is that different ID numbers were used in different data sets. For example, one person might have ID numbers 1, 2, 3, and 4. In one data set they may be listed as "1", in another data set they may be listed as "2", and so on. We are in the process of finding all the duplicate ID numbers and making a "master ID number" for each person, but I don't know how to combine all the data once we get to that point.
Most of the data we are concerned with capturing is in a "yes/no" format. So, following the example above, if there is a "yes" under ANY of a given person's ID numbers for a certain variable, we want the master ID number to say "yes" for that variable.
Does anyone have any ideas? Feel free to ask me more questions if this isn't clear.
Thanks,
Alyssa
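
One possible way to handle the combining step described above, as a minimal Python sketch rather than a definitive solution. It assumes you end up with a lookup table from every old ID number to its master ID (called `crosswalk` here) and that the rows from all data sets can be pooled into one list of (ID number, variable, answer) entries. The names `crosswalk`, `records`, `combine`, the master IDs, and the variables `smoker` and `diabetic` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical crosswalk: every known ID number -> that person's master ID.
crosswalk = {1: "M001", 2: "M001", 3: "M001", 4: "M001", 5: "M002"}

# Hypothetical rows pooled from several data sets: (id_number, variable, answer).
records = [
    (1, "smoker", "no"),
    (2, "smoker", "yes"),
    (3, "diabetic", "no"),
    (5, "smoker", "no"),
]

def combine(records, crosswalk):
    """Collapse rows onto master IDs; a variable is "yes" if ANY row says "yes"."""
    combined = defaultdict(dict)
    for id_number, variable, answer in records:
        # Raises KeyError if an ID number is missing from the crosswalk,
        # which flags IDs that still need a master ID assigned.
        master = crosswalk[id_number]
        is_yes = answer.strip().lower() == "yes"
        previous = combined[master].get(variable, "no")
        combined[master][variable] = "yes" if is_yes or previous == "yes" else "no"
    return {master: dict(values) for master, values in combined.items()}

result = combine(records, crosswalk)
print(result)
```

In this made-up data, the person with ID numbers 1 through 4 ends up as "yes" for `smoker` because one of their ID numbers has a "yes", even though another has a "no". The same idea carries over to a spreadsheet or statistics package: add the master ID as a new column, then group by master ID and take the "any yes" value for each variable.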