
  • Bysort command to drop observations

    Hello,

    I am using the bysort command in Stata 18.0. My dataset looks like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(family position_in_family age)
    1 1 30
    1 2 30
    1 3  5
    1 4 24
    2 1 37
    2 2 35
    2 4 44
    2 5 55
    3 1 45
    4 1 44
    4 2 44
    4 3 18
    5 1 34
    5 2 45
    5 3  4
    5 4 34
    6 1 54
    6 2 45
    6 3 21
    end

    Here, I have family data. The position in the family is coded as follows: 1 = husband, 2 = wife, 3 = their kid, and 4 or 5 = other family members present in the household.

    I have the ages of the family members.

    Now I want to drop those families where the kid is older than 6. I am applying the bysort command.

    Code:
    bysort family(position): drop if age[3] > 6 & age[3] < 71
    Please note that 70 is the highest age of a child in my dataset.

    I am not sure why this code isn't working. I still see some families where the kid is much older than 6.

    I would really appreciate it if anyone could help!


  • #2
    Code:
    bysort family: egen tag= max(position_in_family==3 & age>6)
    drop if tag
    Note that the expression above is true for an individual who is a kid (position_in_family==3) with age>6. Because Stata treats numeric missing values as larger than any nonmissing number, this will also drop families that have at least one kid with a missing value on age. If you do not want this:

    Code:
    bysort family: egen tag= max(position_in_family==3 & age>6 & age<.)
    drop if tag
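    As a quick check of why the extra condition matters, the sketch below evaluates the two comparisons directly on a missing value. It assumes nothing beyond built-in Stata expressions and uses no data.

```stata
* Numeric missing (.) is treated as larger than any nonmissing number,
* so a missing age passes the age>6 test: this displays 1
display (. > 6)

* Adding age<. rules out missing ages: this displays 0
display (. > 6 & . < .)
```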
    The logic of using the -max()- function of egen is explained in https://www.stata.com/support/faqs/d...ble-recording/.
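    A minimal sketch of that logic on two of the families from #1 may help. The variable names kid_over6 and age_row3 are made up for illustration. It also shows one likely reason the original attempt misbehaved: within a by-group, age[3] is the age in the third row of the sorted family, whatever that row's position code is, and it is missing when the family has fewer than three rows.

```stata
clear
input float(family position_in_family age)
2 1 37
2 2 35
2 4 44
2 5 55
4 1 44
4 2 44
4 3 18
end

* Row-level indicator: 1 only on rows that are a kid older than 6
generate byte kid_over6 = position_in_family == 3 & age > 6 & age < .

* Family-level flag: 1 if any row in the family has the indicator set
bysort family: egen tag = max(kid_over6)

* For contrast: the third row of the sorted family, not position 3
bysort family (position_in_family): generate age_row3 = age[3]

list, sepby(family) noobs
```

    In this sketch family 2 has no kid at all, so tag is 0, yet age_row3 picks up the age of the member in position 4. Family 4 has a kid aged 18, so tag is 1 on every row of that family.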



    • #3
      Dear Andrew Musau,

      Thank you so much for the code. It worked perfectly.
      Also thank you for sharing the reading!
