Balanced Panel

Mahir Labib

Join Date: Aug 2021

Posts: 4
#1

02 Nov 2021, 09:08

Hi everyone, I want to build a balanced panel data set. I have baseline and endline data. To keep the panel balanced, after appending the endline data I want to keep only those observations that also have baseline responses. The data are at the household level. Each respondent in the baseline survey has a unique ID with no duplicates. The household ID, in contrast, has duplicates because some respondents belong to the same household.

My plan is to append the endline data, check for duplicates on the unique ID (which appears once in the baseline and once in the endline), and drop the IDs that have no duplicate copy, meaning they are not in both baseline and endline. Is this a viable approach? If so, I know the duplicates command, but I do not know how to keep only the IDs that have exactly one duplicate copy. I appreciate your help with this!
Clyde Schechter

Join Date: Apr 2014

Posts: 30357
#2

02 Nov 2021, 10:16

Code:

by unique_id_variable, sort: keep if _N == 2

will drop the people who have only baseline or only endline data and retain those who have both. Note that this assumes there are no other time points preceding baseline, following endline, or strictly between them.
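
For context, the whole workflow might look something like the sketch below. The file names (baseline.dta, endline.dta) and variable names (unique_id, wave, dup) are hypothetical placeholders, not names from the original post. The isid line is an added safeguard: the _N == 2 rule only identifies respondents present in both waves if each ID appears at most once per wave, so it seems worth checking that before dropping anything.

```stata
* Hypothetical names: baseline.dta, endline.dta, unique_id, wave
use baseline, clear
gen byte wave = 0                  // mark baseline observations
append using endline
replace wave = 1 if missing(wave)  // appended endline rows have wave missing

* Stop with an error if any ID appears more than once within a wave
isid unique_id wave

* Keep only respondents observed in both waves
by unique_id, sort: keep if _N == 2
```

If you would rather stay with the duplicates command, an equivalent alternative to the last line should be to tag the surplus copies and keep the IDs that have exactly one:

```stata
* dup counts the other observations sharing the same unique_id
duplicates tag unique_id, generate(dup)
keep if dup == 1
```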
Mahir Labib

Join Date: Aug 2021

Posts: 4
#3

02 Nov 2021, 22:19

Thank you so much! Appreciate it.