  • Balanced Panel

    Hi everyone, I want to build a balanced panel data set from baseline and endline data. To keep it balanced, after appending the endline data I only want to keep the observations that also have baseline responses. This is household-level data. Each respondent has a unique ID in the baseline survey, with no duplicates. The household ID, in contrast, does have duplicates, because some respondents belong to the same household.

    My plan is to append the endline data, check for duplicates of the unique ID (which appears once in baseline and once in endline), and drop the IDs that have no duplicate copy, since those are not part of both baseline and endline. Is this a viable approach? If so, I know the duplicates command, but I do not know how to keep only the observations that have exactly one duplicate copy. I appreciate your help with this!

  • #2
    Code:
    by unique_id_variable, sort: keep if _N == 2
    will drop the people who have only baseline or only endline data and retain those who have both. Note that this assumes there are no other time points preceding baseline, following endline, or strictly between them. It also relies on each ID appearing at most once within each wave, as you describe.
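
    A fuller sketch of the workflow around that command, under some assumptions: the file names baseline.dta and endline.dta, the ID variable unique_id, and the new variable wave are hypothetical placeholders, not names from your data.

    ```stata
    * Hypothetical names: baseline.dta, endline.dta, unique_id, wave
    use baseline, clear

    * Stack the endline data; wave is 0 for baseline rows, 1 for endline rows
    append using endline, generate(wave)

    * Stop with an error if any respondent appears more than once within a wave
    isid unique_id wave

    * Keep only respondents observed in both waves
    by unique_id, sort: keep if _N == 2
    ```

    If you would rather stay with the duplicates command, duplicates tag unique_id, generate(dup) followed by keep if dup == 1 should give the same result, because dup counts the surplus copies of each ID and is 0 for IDs that appear only once.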

    • #3
      Thank you so much! Appreciate it.
