Hi everyone, I want to build a balanced panel data set from baseline and endline data. To keep it balanced, after appending the endline data, I only want to keep those observations that also have baseline responses. This is household-level data. Each respondent in the baseline survey has a unique ID with no duplicates. In contrast, the household ID has duplicates, because some respondents belong to the same household.
I was thinking that after appending the endline data, I would check for duplicates of the unique ID (which should appear once in the baseline and once in the endline) and drop the IDs that have no copies (meaning they are not part of both baseline and endline). Is this a viable approach? If so, I know the duplicates command, but I do not know how to keep only the observations that have exactly one duplicate copy. I appreciate your help with this!
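
For concreteness, the idea above could be sketched in Stata roughly as follows. This is only a sketch under assumptions, not a tested solution: the file names `baseline.dta` and `endline.dta` and the variable names `resp_id`, `wave`, and `dup` are hypothetical placeholders, and it assumes the respondent ID has the same name and type in both files.

```stata
* Hypothetical names: baseline.dta, endline.dta, resp_id, wave, dup
use baseline, clear

* Stack endline under baseline; wave = 0 for baseline rows, 1 for endline rows
append using endline, generate(wave)

* Stop with an error if any ID appears more than once within a wave
isid resp_id wave

* dup = number of other observations sharing the same resp_id
duplicates tag resp_id, generate(dup)

* Exactly one other copy means the ID is in both baseline and endline
keep if dup == 1
drop dup
```

The `isid` check matters because `dup == 1` alone would also be true for an ID that appears twice in one wave and never in the other. An equivalent one-line filter, under the same assumptions, would be `bysort resp_id: keep if _N == 2`.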