Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to identify the oldest and youngest children in each HH

    Hi all,

    I would like to create variables that could identify who is the oldest and youngest children in each household. Any help is much appreciated. An example of my data is as below. While names of variables may look odd, they follow a logic, i.e. each child information in each HH is started with b0* variable and ended with b4*y variables (where * could be 2, 3, 4, 5, and 6, meaning a HH may have 5 children). I hope that labels of each variable may make those variables clear.

    Thank you.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long qid byte(b02 b22s b32r) int b42y byte(b03 b23s b33r) int b43y byte(b04 b24s b34r) int b44y byte(b05 b25s b35r) int b45y byte(b06 b26s b36r) int b46y
     11 2 2  4 1970 3 1 10 1999 4 1 10 1999 5 1 10 2001 . .  .    .
     13 2 2  2 1950 3 1  3 1975 4 2  5 1979 5 1 10 2005 6 1 10 2006
     15 2 2  2 1950 . .  .    . . .  .    . . .  .    . . .  .    .
     16 2 1  2 1945 3 1  3 1971 4 2  5 1975 5 1 10 1995 6 2 10 1998
     18 2 1  2 1931 3 2 10 1993 4 2 10 1994 . .  .    . . .  .    .
     19 . .  .    . . .  .    . . .  .    . . .  .    . . .  .    .
    110 2 1  2 1928 3 1  3 1959 4 2  5 1964 5 2 10 1988 6 1 10 1995
    111 2 2  2 1948 3 1  3 1977 4 2  5 1977 5 1 10 2005 6 2 10 2010
    112 2 1  3 1977 3 2  5 1981 4 2 10 2005 5 1 10 2006 6 2  2 1940
    113 2 2  5 1963 3 1 10 1990 4 2 10 1993 . .  .    . . .  .    .
    114 . .  .    . . .  .    . . .  .    . . .  .    . . .  .    .
    115 2 2  2 1959 3 1 10 2008 . .  .    . . .  .    . . .  .    .
    119 2 1  2 1940 3 2 10 1999 . .  .    . . .  .    . . .  .    .
    120 2 2  2 1945 3 2 10 1999 . .  .    . . .  .    . . .  .    .
    122 2 2  2 1954 . .  .    . . .  .    . . .  .    . . .  .    .
    123 2 1  2 1948 3 2  5 1983 4 1  3 1988 5 1 10 2010 . .  .    .
    124 2 2  2 1939 3 1 10 2000 . .  .    . . .  .    . . .  .    .
    125 2 1  2 1942 3 1  3 1982 4 2  5 1983 5 1 10 2009 6 2 10 2011
    126 . .  .    . . .  .    . . .  .    . . .  .    . . .  .    .
    127 . .  .    . . .  .    . . .  .    . . .  .    . . .  .    .
    128 2 2  2 1933 . .  .    . . .  .    . . .  .    . . .  .    .
    129 2 2  4 1985 3 1  5 1983 4 2 10 2010 . .  .    . . .  .    .
    134 2 1  2 1943 3 1  3 1975 4 2  5 1977 5 1 10 2005 . .  .    .
    136 2 2  2 1940 3 1  3 1963 4 2  4 1967 5 1  5 1961 6 1 10 1999
    137 2 1  3 1976 3 2  5 1977 4 2  3 1980 5 1  3 1983 6 1 10 2003
    138 2 2  4 1969 3 1  5 1968 . .  .    . . .  .    . . .  .    .
    139 2 1  3 1955 3 2  5 1957 4 1 10 1990 . .  .    . . .  .    .
    140 2 1  2 1944 3 1  3 1978 . .  .    . . .  .    . . .  .    .
    141 2 1  3 1970 3 2  5 1972 4 1 10 1998 5 1 10 2008 . .  .    .
    142 2 2  2 1933 3 2  4 1959 . .  .    . . .  .    . . .  .    .
    146 2 2  2 1945 3 1  3 1976 4 1  3 1978 5 2  5 1987 6 1 10 2005
    147 2 1  2 1934 3 1  3 1957 . .  .    . . .  .    . . .  .    .
    148 2 1  2 1936 3 1  3 1982 4 2  5 1982 5 1 10 2005 6 2 10 2007
    149 2 1  2 1945 . .  .    . . .  .    . . .  .    . . .  .    .
    150 2 2  2 1950 . .  .    . . .  .    . . .  .    . . .  .    .
    151 2 1  3 1975 3 2  5 1978 4 2 10 2007 5 2 10 2010 . .  .    .
    152 2 2  2 1958 3 1  3 1986 4 2  5 1986 5 1  3 1992 6 1 10 2006
    153 2 1  3 1955 3 2  5 1958 4 1 10 1985 5 1 10 1990 . .  .    .
    154 2 1  2 1939 3 1  3 1977 4 2  5 1989 5 2 10 2001 6 2 10 2007
    155 2 1 10 2001 3 1 10 1994 . .  .    . . .  .    . . .  .    .
    156 2 1  2 1933 3 1  3 1957 4 2  5 1956 5 1 10 1981 6 2 10 1982
    159 2 1  3 1962 3 2  5 1966 4 1 10 1986 5 2 10 1988 6 2 10 1996
    161 2 1  4 1968 3 1  3 1963 4 2  5 1964 5 2 10 1989 6 1 10 1995
    162 . .  .    . . .  .    . . .  .    . . .  .    . . .  .    .
    163 2 1  2 1927 3 1  3 1956 4 2  5 1983 5 1 10 2003 . .  .    .
    164 2 1  2 1949 3 1  3 1973 4 2  5 1973 5 1  3 1979 6 2  5 1980
    167 2 2  2 1948 3 2  4 1978 4 1  5 1983 5 2 10 2010 6 2  4 1980
    168 2 1  2 1930 3 1  3 1973 4 2  4 1975 5 1 10 2000 6 2 10 2004
    169 2 2  2 1931 3 1  3 1959 4 2  5 1966 5 1 10 1992 . .  .    .
    171 2 1  2 1921 3 1  3 1962 4 2  5 1965 5 1 10 1989 6 1 10 2000
    173 2 2  2 1938 3 1  3 1970 4 2  5 1987 . .  .    . . .  .    .
    175 2 2  2 1943 3 1  3 1980 . .  .    . . .  .    . . .  .    .
    176 2 2  2 1945 3 1  3 1979 4 2 10 2000 . .  .    . . .  .    .
    177 2 2  2 1943 . .  .    . . .  .    . . .  .    . . .  .    .
    178 2 1  2 1946 3 1  3 1974 4 2  5 1975 5 1 10 1995 6 2 10 2005
    180 2 2  2 1953 3 1  3 1990 . .  .    . . .  .    . . .  .    .
    181 2 1  2 1950 3 1  3 1985 . .  .    . . .  .    . . .  .    .
    182 2 1  3 1957 3 2  4 1962 4 1 10 1993 . .  .    . . .  .    .
    183 2 2  4 1953 3 2  4 1949 4 1  3 1962 5 2  5 1965 . .  .    .
    184 2 2  7 1931 3 2 12 1953 4 1 12 1962 5 2 12 1965 . .  .    .
    186 2 1  3 1955 3 2  5 1962 4 1 10 1986 5 2 10 1985 6 1 10 2007
    188 2 1  3 1964 . .  .    . . .  .    . . .  .    . . .  .    .
    190 2 1  3 1971 3 2  5 1974 4 2 10 2001 5 2 10 2009 . .  .    .
    191 2 2  2 1951 3 1  3 1984 . .  .    . . .  .    . . .  .    .
    192 2 2  2 1946 3 1  3 1971 4 1  3 1977 . .  .    . . .  .    .
    193 2 1  3 1964 3 2  5 1968 4 1 10 2003 5 2 10 1995 . .  .    .
    194 2 2  2 1941 3 1 10 1986 4 1 10 2007 . .  .    . . .  .    .
    196 2 2  4 1956 . .  .    . . .  .    . . .  .    . . .  .    .
    197 2 1  2 1949 3 1  3 1981 4 2  5 1985 5 2 10 2007 . .  .    .
    199 2 1  2 1931 . .  .    . . .  .    . . .  .    . . .  .    .
    331 2 2  2 1963 3 2  4 1993 . .  .    . . .  .    . . .  .    .
    332 2 1  3 1968 3 2  5 1970 4 1 10 1993 . .  .    . . .  .    .
    333 2 1  2 1932 3 1  3 1973 4 2  5 1975 5 1 10 1997 6 2 10 2000
    336 2 1  2 1947 3 1  3 1975 4 2  5 1979 5 1 10 2002 6 2 10 2006
    337 2 1  3 1965 3 2  5 1970 4 1 10 1992 5 1 10 1996 . .  .    .
    338 2 2  4 1967 3 1 10 2006 . .  .    . . .  .    . . .  .    .
    362 2 1  2 1933 3 1  3 1963 4 2  5 1967 5 2 10 1994 6 1 10 2003
    363 2 2  2 1952 3 2  4 1977 4 1 10 1998 . .  .    . . .  .    .
    366 2 2  2 1940 3 1  3 1960 4 2  5 1970 5 1 10 1992 . .  .    .
    367 2 2  2 1952 3 2 10 1992 . .  .    . . .  .    . . .  .    .
    368 2 2  2 1950 3 1 10 2008 . .  .    . . .  .    . . .  .    .
    369 2 2  2 1930 3 1  3 1949 4 2  5 1949 5 2 10 1987 6 1 10 1994
    382 2 1  2 1929 . .  .    . . .  .    . . .  .    . . .  .    .
    383 2 2  2 1939 . .  .    . . .  .    . . .  .    . . .  .    .
    384 2 1  2 1940 3 1  3 1975 4 2  5 1982 5 1 10 2006 . .  .    .
    385 2 2  2 1951 . .  .    . . .  .    . . .  .    . . .  .    .
    387 2 1  2 1937 3 1  3 1977 4 2  5 1980 5 2 10 2006 6 1 10 2009
    389 2 2  2 1947 3 1  3 1975 4 2  5 1982 5 1 10 2006 . .  .    .
    461 2 2  2 1931 3 1 10 1979 4 2 10 1992 . .  .    . . .  .    .
    462 2 1  2 1927 3 1 10 1979 4 2 10 1992 . .  .    . . .  .    .
    463 2 2  2 1947 3 1  3 1979 . .  .    . . .  .    . . .  .    .
    464 2 1  2 1940 3 1  3 1973 4 2  5 1977 5 2 10 2006 6 1 10 2008
    465 2 1  3 1966 3 2  5 1974 4 2 10 2003 5 2 10 2011 . .  .    .
    466 2 2  2 1960 . .  .    . . .  .    . . .  .    . . .  .    .
    469 2 1  2 1947 3 1  3 1981 4 2 10 1984 5 2 10 2001 6 1 10 2005
    491 2 2  2 1961 3 2  4 1983 4 2  4 1991 5 2 10 2011 . .  .    .
    493 2 2  2 1935 3 2  4 1958 . .  .    . . .  .    . . .  .    .
    494 2 1  3 1982 . .  .    . . .  .    . . .  .    . . .  .    .
    496 2 2  2 1944 3 1  3 1973 4 2  5 1977 5 2 10 2001 6 2 10 2009
    497 2 2  2 1950 3 2  4 1983 . .  .    . . .  .    . . .  .    .
    end
    label values b22s LABEL_B22
    label def LABEL_B22 1 "Male", modify
    label def LABEL_B22 2 "Female", modify
    label values b32r LABEL_B32R
    label def LABEL_B32R 2 "Spouse", modify
    label def LABEL_B32R 3 "Son", modify
    label def LABEL_B32R 4 "Daughter", modify
    label def LABEL_B32R 5 "Son/daughter in law", modify
    label def LABEL_B32R 7 "Parent", modify
    label def LABEL_B32R 10 "Grand children", modify
    label values b23s LABEL_B23
    label def LABEL_B23 1 "Male", modify
    label def LABEL_B23 2 "Female", modify
    label values b33r LABEL_B33R
    label def LABEL_B33R 3 "Son", modify
    label def LABEL_B33R 4 "Daughter", modify
    label def LABEL_B33R 5 "Son/daughter in law", modify
    label def LABEL_B33R 10 "Grand children", modify
    label def LABEL_B33R 12 "Other relatives", modify
    label values b24s LABEL_B24
    label def LABEL_B24 1 "Male", modify
    label def LABEL_B24 2 "Female", modify
    label values b34r LABEL_B34R
    label def LABEL_B34R 3 "Son", modify
    label def LABEL_B34R 4 "Daughter", modify
    label def LABEL_B34R 5 "Son/daughter in law", modify
    label def LABEL_B34R 10 "Grand children", modify
    label def LABEL_B34R 12 "Other relatives", modify
    label values b25s LABEL_B25
    label def LABEL_B25 1 "Male", modify
    label def LABEL_B25 2 "Female", modify
    label values b35r LABEL_B35R
    label def LABEL_B35R 3 "Son", modify
    label def LABEL_B35R 5 "Son/daughter in law", modify
    label def LABEL_B35R 10 "Grand children", modify
    label def LABEL_B35R 12 "Other relatives", modify
    label values b26s LABEL_B26
    label def LABEL_B26 1 "Male", modify
    label def LABEL_B26 2 "Female", modify
    label values b36r LABEL_B36R
    label def LABEL_B36R 2 "Spouse", modify
    label def LABEL_B36R 4 "Daughter", modify
    label def LABEL_B36R 5 "Son/daughter in law", modify
    label def LABEL_B36R 10 "Grand children", modify

  • #2
    Does this help? If there were multiple births in the same year, this code chooses the first identifier out of 2 to 6 that occurs with the same year.

    Code:
    egen oldest_age = rowmin(b4?y)
    egen youngest_age = rowmax(b?4y)
    
    gen oldest_id = .
    gen youngest_id = .
    forval j = 2/6 {
        replace oldest_id = `j' if oldest_age == b4`j'y & missing(oldest_id)
        replace youngest_id = `j' if youngest_age == b4`j'y & missing(youngest_id)
    }
    
    list oldest* youngest* if !missing(oldest_age) | !missing(youngest_age)
    For a review of technique see https://www.stata-journal.com/articl...article=pr0046 and if more examples are wanted references here to pr0046

    A sequel to that paper is in press for Stata Journal 20(2).

    The code does nothing special for the case of one child, who is both oldest and youngest.

    Some Statalist friends would want me to underline that reshape long makes many problems easier here.

    Comment


    • #3
      Originally posted by Nick Cox View Post
      Does this help? If there were multiple births in the same year, this code chooses the first identifier out of 2 to 6 that occurs with the same year.

      Code:
      egen oldest_age = rowmin(b4?y)
      egen youngest_age = rowmax(b?4y)
      
      gen oldest_id = .
      gen youngest_id = .
      forval j = 2/6 {
      replace oldest_id = `j' if oldest_age == b4`j'y & missing(oldest_id)
      replace youngest_id = `j' if youngest_age == b4`j'y & missing(youngest_id)
      }
      
      list oldest* youngest* if !missing(oldest_age) | !missing(youngest_age)
      For a review of technique see https://www.stata-journal.com/articl...article=pr0046 and if more examples are wanted references here to pr0046

      A sequel to that paper is in press for Stata Journal 20(2).

      The code does nothing special for the case of one child, who is both oldest and youngest.

      Some Statalist friends would want me to underline that reshape long makes many problems easier here.
      Dear Professor Nick,

      Thank you for your help, your codes work. However, because not all of 5 people are children, they could be spouse or grandchildren so I would add one more condition to your codes as following ( in red)
      Code:
      forval j = 2/6 {
          replace oldest_id = `j' if oldest_age == b4`j'y & missing(oldest_id) & (b3`j'r==3 | b3`j'r==4 | b3`j'r==5)
          replace youngest_id = `j' if youngest_age == b4`j'y & missing(youngest_id) & (b3`j'r==3 | b3`j'r==4 | b3`j'r==5)
      }
      Also, as you pointed out that someone may want to reshape the data from a wide layout to a long one before solving the issue presented here. Could you please demonstrate how to do it? I just want to learn another method.

      Thank you.

      Comment


      • #4
        I was puzzled by people with birth dates like 1940 in the results, but was taking #1 literally and figuring everyone is someone else's child.

        Your fix needs to be more pervasive. I haven't tested this. age really means year of birth....


        Code:
        foreach v in oldest_id youngest_id oldest_age  { 
              gen `v' = . 
        } 
        gen youngest_age = 0  
         forval j = 6(-1)2 {     replace oldest_age = b4`j'y if b4`j'y < oldest_age & inrange(b3`j'r, 3, 5)      replace oldest_id = `j' if oldest_age == b4`j'y      replace youngest_age = b4`j'y if b4`j'y > youngest_age & inrange(b3`j'r, 3, 5)     replace youngest_id = `j' if youngest_age == b4`j'y  }  list oldest* youngest* if !missing(oldest_age) | !missing(youngest_age)
        There are so many examples of reshape long in this forum that I don't feel energetic enough to add another.

        Comment


        • #5
          Sorry: that got mangled by the forum software. Copy and paste within the software often loses end-of-lines. Usually I spot that, but something else got in the way.

          Code:
          foreach v in oldest_id youngest_id oldest_age  {        
               gen `v' = .  
          }  
          
          gen youngest_age = 0    
          
          forval j = 6(-1)2 {          
              replace oldest_age = b4`j'y if b4`j'y < oldest_age & inrange(b3`j'r, 3, 5)          
              replace oldest_id = `j' if oldest_age == b4`j'y            
              replace youngest_age = b4`j'y if b4`j'y > youngest_age & inrange(b3`j'r, 3, 5)          
              replace youngest_id = `j' if youngest_age == b4`j'y  
          }    
          
          list oldest* youngest* if !missing(oldest_age) | !missing(youngest_age)
          Last edited by Nick Cox; 10 May 2020, 08:35.

          Comment


          • #6
            Dear Professor Nick Cox,

            Thank you for your response but I am confusing here. In #2, oldest_year oldest_id, youngest_id, and youngest_year were already generated but it seems you tried to generate those existed variables in #5 so undoubtedly Stata reported errors. Another thought is that I misunderstood your coding procedures here. Please correct me if I am wrong.

            Codes in #1
            Code:
            egen oldest_age = rowmin(b4?y)
            egen youngest_age = rowmax(b?4y)
            
            gen oldest_id = .
            gen youngest_id = .
            Codes in #5
            Code:
            foreach v in oldest_id youngest_id oldest_age  {        
                 gen `v' = .  
            }  
            
            gen youngest_age = 0
            Stata report
            variable oldest_id already defined
            Thank you.

            Comment


            • #7
              Naturally, you need to get rid of previous versions of the variables. The code in #5 is a replacement for the code in #2, not an extension of it.

              "Jason Bourne" looks like a pseudonym to me. It's evident that English is not your first language, which is more than fine, and indeed you are better off telling us your real name. https://www.statalist.org/forums/help#realnames

              Comment

              Working...
              X