  • Attribute of household members within a household

    Hi,

    I have the following data structure:
    Household  ID  Age  Father  Age_Father
    1          1   46   .       .
    1          2   52   .       .
    1          3   20   2       .
    2          1   36   .       .
    2          2   15   1       .
    The variable "Father" gives the id of each child's father. How can I construct the variable "Age_Father", which records the father's age for each child who is matched to a father?

    Thanks so much.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(household id age father)
    1 1 46 .
    1 2 52 .
    1 3 20 2
    2 1 36 .
    2 2 15 1
    end
    
    * sort by id within household so that age[father] picks the observation in position father
    by household (id), sort: gen age_father = age[father]
    In the future, please do not use HTML tables to post example data. They can be very difficult to import into Stata. The helpful way to show data examples is with the -dataex- command, as I have done above. You can install the -dataex- command by running -ssc install dataex-. Then run -help dataex- to read the simple instructions. Use it always when you want to show example data on this Forum.

    Note: This code is strictly dependent on id starting at 1 and increasing through consecutive integers in each household. If that is not true of your data in general, please show an example where this does not hold, and different code that accommodates it can be written.
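
    A quick way to test that assumption before relying on the code above is sketched below. Under -by-, an explicit subscript such as age[father] counts observations within the household, so it finds the father only when each person's id equals his or her position in the sorted household. The variable name id_ok is just an illustrative choice, not part of the original code.

    ```stata
    clear
    input byte(household id age father)
    1 1 46 .
    1 2 52 .
    1 3 20 2
    2 1 36 .
    2 2 15 1
    end

    * Within each household sorted by id, _n is the position of the observation.
    * id_ok is 1 when id matches that position and 0 otherwise.
    by household (id), sort: gen byte id_ok = (id == _n)

    * Number of observations that break the assumption
    count if !id_ok
    ```

    For the example data the count is 0, so the subscript approach is safe there. Any nonzero count means a different method is needed.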


    • #3
      Thanks Clyde. I am sorry for using the tables.

      I ran into the problem that you mentioned: the id does not always start at 1, and it does not always increase in consecutive integers. Some households have data like the following (I'm using -dataex-):
      clear
      input byte(household id age father)
      1 1 46 .
      1 2 52 .
      1 3 20 2
      2 1 36 .
      2 2 15 1
      3 2 46 .
      3 3 12 2
      3 4 2 2
      4 1 38 .
      4 3 12 1
      4 4 15 1
      4 5 11 1
      end
      I tried your code, and it does not work for household 3: it returns the age of the second individual listed in the household rather than the age of the individual whose id is 2. Thanks


      • #4
        You can do this with rangestat (SSC), which you must install first. Look in the help: one of the examples is similar and explains the logic.

        Code:
        clear
        input byte(household id age father)
        1 1 46 .
        1 2 52 .
        1 3 20 2
        2 1 36 .
        2 2 15 1
        3 2 46 .
        3 3 12 2
        3 4 2 2
        4 1 38 .
        4 3 12 1
        4 4 15 1
        4 5 11 1
        end
        
        gen targetid = cond(missing(father), 0, father) 
        
        * install rangestat from SSC; this needs doing only once
        ssc install rangestat
        
        rangestat (min) father_age=age, interval(id targetid targetid) by(household) 
        
        list, sepby(household) 
        
             +----------------------------------------------------+
             | househ~d   id   age   father   targetid   father~e |
             |----------------------------------------------------|
          1. |        1    1    46        .          0          . |
          2. |        1    2    52        .          0          . |
          3. |        1    3    20        2          2         52 |
             |----------------------------------------------------|
          4. |        2    1    36        .          0          . |
          5. |        2    2    15        1          1         36 |
             |----------------------------------------------------|
          6. |        3    2    46        .          0          . |
          7. |        3    3    12        2          2         46 |
          8. |        3    4     2        2          2         46 |
             |----------------------------------------------------|
          9. |        4    1    38        .          0          . |
         10. |        4    3    12        1          1         38 |
         11. |        4    4    15        1          1         38 |
         12. |        4    5    11        1          1         38 |
             +----------------------------------------------------+
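
        Recoding missing father to 0 relies on two properties of the data that -rangestat- will not check for you: no person has an id of 0, and ids are unique within household (otherwise more than one observation could fall in the interval). A minimal sketch of checking both beforehand, using only built-in commands:

        ```stata
        clear
        input byte(household id age father)
        1 1 46 .
        1 2 52 .
        1 3 20 2
        2 1 36 .
        2 2 15 1
        3 2 46 .
        3 3 12 2
        3 4 2 2
        4 1 38 .
        4 3 12 1
        4 4 15 1
        4 5 11 1
        end

        * Stops with an error if household and id do not jointly identify observations
        isid household id

        * Stops with an error if 0 is ever used as a real id
        assert id != 0
        ```

        Both commands are silent when the data pass, so no output here means 0 is a safe stand-in for "no father recorded".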


        • #5
          Right, as I said, that code would only work with consecutive id numbers starting at 1. So here's a different approach that is more general:

          Code:
          clear
          input byte(household id age father)
          1 1 46 .
          1 2 52 .
          1 3 20 2
          2 1 36 .
          2 2 15 1
          3 2 46 .
          3 3 12 2
          3 4 2 2
          4 1 38 .
          4 3 12 1
          4 4 15 1
          4 5 11 1
          end
          
          // FIND A NUMBER THAT NEVER APPEARS AS AN ID
          summ id, meanonly
          local mvcode = r(max) + 1
          
          //    GENERATE A NEW FATHER VARIABLE WITH MISSING VALUES
          //    REPLACED BY mvcode
          clonevar father2 = father
          replace father2 = `mvcode' if missing(father)
          
          //    NOW CALCULATE FATHER AGES
          rangestat (mean) father_age = age, by(household) interval(id father2 father2)
          Notes:

          1. You need to install the -rangestat- command if you don't already have it. It's written by Robert Picard and you can get it by running -ssc install rangestat-. It's really quite useful for lots of things, so worth having as part of your Stata installation anyway.

          2. The father2 variable recodes the missing values of father to a value that does not appear anywhere as an id in the data set. It is needed because -rangestat- will not handle missing values of father in the way you would want it to.
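
          For anyone who would rather avoid both the extra installation and the recoding of missing values, the same lookup can be sketched with the built-in -merge- command instead. This is a different technique from the -rangestat- solution above, offered only as an alternative. It assumes that household and id together uniquely identify people, and the tempfile name lookup is an arbitrary choice.

          ```stata
          clear
          input byte(household id age father)
          1 1 46 .
          1 2 52 .
          1 3 20 2
          2 1 36 .
          2 2 15 1
          3 2 46 .
          3 3 12 2
          3 4 2 2
          4 1 38 .
          4 3 12 1
          4 4 15 1
          4 5 11 1
          end

          * Save a lookup table with one row per person.
          * Renaming id to father lets each child's father value serve as the merge key.
          preserve
          keep household id age
          rename (id age) (father father_age)
          tempfile lookup
          save `lookup'
          restore

          * Attach the father's age to each child. People with no recorded
          * father find no match and keep a missing father_age.
          merge m:1 household father using `lookup', keep(master match) nogenerate

          sort household id
          list, sepby(household)
          ```

          Because the tempfile name is a local macro, the block needs to be run in one go from a do-file rather than line by line.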

          And thanks for using -dataex-.

          Added: Crossed with Nick's post, which proposes essentially the same solution. Nick uses 0 for the number that never occurs as an id. And his -rangestat- command looks for the minimum value of age whereas mine looks for the mean. But since there is only one such value anyway, those will be the same.

          Also added: -rangestat- is not just by Robert Picard. Nick Cox is a co-author, as is Roberto Ferrer. Apologies!
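
          The point above about minimum versus mean can be checked directly. The sketch below asks -rangestat- for both statistics in one call and asserts that they agree. It assumes -rangestat- is already installed from SSC, and the names fa_min and fa_mean are illustrative only.

          ```stata
          clear
          input byte(household id age father)
          1 1 46 .
          1 2 52 .
          1 3 20 2
          2 1 36 .
          2 2 15 1
          3 2 46 .
          3 3 12 2
          3 4 2 2
          4 1 38 .
          4 3 12 1
          4 4 15 1
          4 5 11 1
          end

          * Recode missing father to a value that never occurs as an id
          summ id, meanonly
          gen father2 = cond(missing(father), r(max) + 1, father)

          * Request the minimum and the mean of age over the same one-point interval
          rangestat (min) fa_min=age (mean) fa_mean=age, by(household) interval(id father2 father2)

          * With at most one matching observation, the two statistics coincide
          assert fa_min == fa_mean
          ```

          The assertion also passes for people with no father, since both results are then missing.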
          Last edited by Clyde Schechter; 14 Mar 2017, 13:24.


          • #6
            Thanks Nick and Clyde!
