  • How to generate a value for groups of distinct value of a variable and then drop subgroups based on the value generated?

    Hi! I have a dataset with people's information grouped by family (each observation has a pid variable for the family ID), and I want to drop every family that has no teenagers (no member with age <= 18). Should I use foreach and a loop? If so, how? Thanks!

  • #2
    It is difficult to write code for imaginary data. Your data is not entirely imaginary as you have tried to describe it. But your description is seriously incomplete. In fact, any description in words will be seriously incomplete. When looking for help with code, use the -dataex- command to show example data. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.


    That said, I will venture a guess as to how it might look. No loops needed.

    Code:
    // no_teens is 1 only when every member of the family is older than 18
    by family_id, sort: egen byte no_teens = min(age > 18)
    drop if no_teens
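
    The same no-loop idea can also be sketched without -egen-, by sorting each family by age and looking only at its youngest member. This is a minimal illustration on made-up data: the variable names family_id and age are assumptions carried over from the guess above, and the six observations are invented. Note that Stata sorts missing values last and treats them as larger than any number, so a family whose ages are all missing would be dropped by this version.

```stata
clear
input byte(family_id age)
1 45
1 16
2 38
2 40
3 70
3 12
end

* Within each family, sort by age so the youngest member comes first.
* If even the youngest is older than 18, nobody in the family is aged 18 or under.
bysort family_id (age): drop if age[1] > 18

list, noobs sepby(family_id)
```

    In this toy data only family 2 has no member aged 18 or under, so it is the only family removed.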

    • #3
      Yufei,

      1) There are a lot of examples on Statalist using household or family data. One easy way to find them is to search for "hhid". But take a look at posts here, here and here. (The data below comes from the last link.)

      Just to give you an example of what Clyde's code will look like in practice,

      Code:
      dataex hhid age  // ssc install dataex
      clear
      input byte(hhid age)
      1 64
      1 54
      1 22
      1 19
      1 13
      2 51
      2 41
      2 18
      2 16
      3 21
      4 52
      4 22
      5 55
      5 53
      5 16
      5 14
      5 13
      5  9
      5  6
      5  2
      end
      
      . list, noobs sepby(hhid) 
      
        +------------+
        | hhid   age |
        |------------|
        |    1    64 |
        |    1    54 |
        |    1    22 |
        |    1    19 |
        |    1    13 |
        |------------|
        |    2    51 |
        |    2    41 |
        |    2    18 |
        |    2    16 |
        |------------|
        |    3    21 |
        |------------|
        |    4    52 |
        |    4    22 |
        |------------|
        |    5    55 |
        |    5    53 |
        |    5    16 |
        |    5    14 |
        |    5    13 |
        |    5     9 |
        |    5     6 |
        |    5     2 |
        +------------+
      
      gsort hhid -age  // sorting so oldest would be first (because I just created the data with random ages)
      by hhid: gen n = _n
      by hhid: gen h_size = _N  // count of people in household
      
      bysort hhid (n): gen is_teen = (age <= 18)  
      egen count_teens = total(is_teen), by(hhid)
      
      list, noobs sepby(hhid) abbrev(12)
        +-------------------------------------------------+
        | hhid   age   n   h_size   is_teen   count_teens |
        |-------------------------------------------------|
        |    1    64   1        5         0             1 |
        |    1    54   2        5         0             1 |
        |    1    22   3        5         0             1 |
        |    1    19   4        5         0             1 |
        |    1    13   5        5         1             1 |
        |-------------------------------------------------|
        |    2    51   1        4         0             2 |
        |    2    41   2        4         0             2 |
        |    2    18   3        4         1             2 |
        |    2    16   4        4         1             2 |
        |-------------------------------------------------|
        |    3    21   1        1         0             0 |
        |-------------------------------------------------|
        |    4    52   1        2         0             0 |
        |    4    22   2        2         0             0 |
        |-------------------------------------------------|
        |    5    55   1        8         0             6 |
        |    5    53   2        8         0             6 |
        |    5    16   3        8         1             6 |
        |    5    14   4        8         1             6 |
        |    5    13   5        8         1             6 |
        |    5     9   6        8         1             6 |
        |    5     6   7        8         1             6 |
        |    5     2   8        8         1             6 |
        +-------------------------------------------------+
      
      drop if count_teens == 0
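
      To check that a drop like this does what is intended, the steps can be rerun on a shortened copy of the example and the result verified with -assert-. This is only a sketch: it keeps the variable names used above but uses a seven-row subset of the data, and it skips the n and h_size helpers because they are not needed for the drop itself.

```stata
clear
input byte(hhid age)
1 64
1 13
2 51
2 18
3 21
4 52
4 22
end

* Flag members aged 18 or under, then count them within each household
gen byte is_teen = (age <= 18)
egen count_teens = total(is_teen), by(hhid)

* == tests equality; a single = is assignment and is not valid in an if condition
drop if count_teens == 0

* Every remaining household should have at least one member aged 18 or under
assert count_teens > 0
tabulate hhid
```

      Households 3 and 4 have nobody aged 18 or under, so only households 1 and 2 should remain.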
