Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to create a new variable based on household composition

    Hello,

    For my research, I have a demographic survey with different households and different observations per households (the individuals). Each individual is defined by the relationship he has towards the head of the household.

    Here is my data:
    individualID householdID gender relation_to_head
    1 1 0 1
    2 1 1 2
    3 1 1 3
    4 1 0 3
    5 2 0 1
    6 2 1 2
    7 2 1 6
    8 2 0 3
    9 2 0 3
    10 3 0 1
    11 3 1 2
    12 4 0 1
    13 4 1 2
    14 4 1 3
    15 5 0 1
    16 5 1 2
    17 5 0 3
    18 5 1 4
    19 5 0 5
    20 5 0 5
    The key for understanding the relationship_to_head variable is the following:
    1=Head
    2=Spouse
    3=Child
    4=Mother/father
    5=Grandchild
    6=Child-in-law
    For gender, 0=male and 1=female

    The new variable that I want to create is whether the mother-in-law is present in the household. If we follow the logic, it should give only the householdID=2 and householdID=5.

    Can you guys help me writing the code line for this new variable? I assume it has something to do with by household: but I can't add relationship_to_head==1 & relationship_to_head==2, etc.

    Thank you very much in advance.
    Last edited by Valentine Willemin; 01 Feb 2019, 09:00.

  • #2
    See if this gets you what you want:

    Code:
    * Just shortening some var names so the "list" will work
    rename individualid id
    rename householdid hhid
    rename gender female  // changed name to make it easier to remember female==1
    
    gen mil = 0
    replace mil = 1 if (female==1 & relation==6) | (female==1 & relation==4)  // this is how I defined mother-in-law.  Make sure it is what you want
    egen count_mil = total(mil), by(hhid)
    gen has_mil = (count_mil >=1)
    
    . list , sepby(hhid) abbrev(18) noobs
    
      +-------------------------------------------------------------------+
      | hhid   id   female   relation_to_head   mil   count_mil   has_mil |
      |-------------------------------------------------------------------|
      |    1    1        0                  1     0           0         0 |
      |    1    2        1                  2     0           0         0 |
      |    1    3        1                  3     0           0         0 |
      |    1    4        0                  3     0           0         0 |
      |-------------------------------------------------------------------|
      |    2    5        0                  1     0           1         1 |
      |    2    6        1                  2     0           1         1 |
      |    2    7        1                  6     1           1         1 |
      |    2    8        0                  3     0           1         1 |
      |    2    9        0                  3     0           1         1 |
      |-------------------------------------------------------------------|
      |    3   10        0                  1     0           0         0 |
      |    3   11        1                  2     0           0         0 |
      |-------------------------------------------------------------------|
      |    4   12        0                  1     0           0         0 |
      |    4   13        1                  2     0           0         0 |
      |    4   14        1                  3     0           0         0 |
      |-------------------------------------------------------------------|
      |    5   15        0                  1     0           1         1 |
      |    5   16        1                  2     0           1         1 |
      |    5   17        0                  3     0           1         1 |
      |    5   18        1                  4     1           1         1 |
      |    5   19        0                  5     0           1         1 |
      |    5   20        0                  5     0           1         1 |
      +-------------------------------------------------------------------+
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(hhid id female relation_to_head)
    1  1 0 1
    1  2 1 2
    1  3 1 3
    1  4 0 3
    2  5 0 1
    2  6 1 2
    2  7 1 6
    2  8 0 3
    2  9 0 3
    3 10 0 1
    3 11 1 2
    4 12 0 1
    4 13 1 2
    4 14 1 3
    5 15 0 1
    5 16 1 2
    5 17 0 3
    5 18 1 4
    5 19 0 5
    5 20 0 5
    end
    Last edited by David Benson; 01 Feb 2019, 10:58.

    Comment


    • #3
      Originally posted by David Benson View Post
      See if this gets you what you want:

      Code:
      * Just shortening some var names so the "list" will work
      rename individualid id
      rename householdid hhid
      rename gender female // changed name to make it easier to remember female==1
      
      gen mil = 0
      replace mil = 1 if (female==1 & relation==6) | (female==1 & relation==4) // this is how I defined mother-in-law. Make sure it is what you want
      egen count_mil = total(mil), by(hhid)
      gen has_mil = (count_mil >=1)
      
      . list , sepby(hhid) abbrev(18) noobs
      
      +-------------------------------------------------------------------+
      | hhid id female relation_to_head mil count_mil has_mil |
      |-------------------------------------------------------------------|
      | 1 1 0 1 0 0 0 |
      | 1 2 1 2 0 0 0 |
      | 1 3 1 3 0 0 0 |
      | 1 4 0 3 0 0 0 |
      |-------------------------------------------------------------------|
      | 2 5 0 1 0 1 1 |
      | 2 6 1 2 0 1 1 |
      | 2 7 1 6 1 1 1 |
      | 2 8 0 3 0 1 1 |
      | 2 9 0 3 0 1 1 |
      |-------------------------------------------------------------------|
      | 3 10 0 1 0 0 0 |
      | 3 11 1 2 0 0 0 |
      |-------------------------------------------------------------------|
      | 4 12 0 1 0 0 0 |
      | 4 13 1 2 0 0 0 |
      | 4 14 1 3 0 0 0 |
      |-------------------------------------------------------------------|
      | 5 15 0 1 0 1 1 |
      | 5 16 1 2 0 1 1 |
      | 5 17 0 3 0 1 1 |
      | 5 18 1 4 1 1 1 |
      | 5 19 0 5 0 1 1 |
      | 5 20 0 5 0 1 1 |
      +-------------------------------------------------------------------+
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(hhid id female relation_to_head)
      1 1 0 1
      1 2 1 2
      1 3 1 3
      1 4 0 3
      2 5 0 1
      2 6 1 2
      2 7 1 6
      2 8 0 3
      2 9 0 3
      3 10 0 1
      3 11 1 2
      4 12 0 1
      4 13 1 2
      4 14 1 3
      5 15 0 1
      5 16 1 2
      5 17 0 3
      5 18 1 4
      5 19 0 5
      5 20 0 5
      end
      Thanks for your response David. I have been thinking about this question too but I have failed to make headway. However in your response don't you think you need to take into account the following:
      1. There are both female-headed and male- headed households.
      2. A female-headed household without a spouse but a child-in-law is a mother-in-law herself.
      3. A female-headed household with a spouse and a mother but without a child-in-law has a mother-in-law. I think this is so because the mother of the household head is a mother-in-law to the spouse.
      4. A male headed household with a spouse and child-in-law has a mother-in-law.
      5. A male headed household with a spouse and a female parent also has a mother-in-law because the mother of the household head is also the mother-in-law to the spouse. e.t.c
      Note:
      I haven't really thought about same sex spouses and
      I'm taking mother-in-law to be the mother of one's husband or wife. It appears your answer is ignoring some of the above possibilities. I don't see some of the possibilities in the dataset she provided but since what she provided is a portion of the entire dataset, I think the other possibilities could be available in the entire dataset as well.

      Please help me understand if there is anything I am not getting right in my interpretation of the question.
      ​​​​​​​This is an important question for me as well.

      Thanks

      Comment


      • #4
        Hi Joseph,

        You are right--households are complicated and my definition of mother-in-law in #2 is missing many of the variations you list. Also, I made a mistake in (female==1 & relation==6), because the child-in-law can be either male or female and still have a mother-in-law in the house.

        Honestly, I had 15 minutes between meetings and hoped to help Valentine going and used the data from the 5 households she provided. I find that with many questions like these, what the poster is asking is something like, "if 1 member of the household has x, how do I make it so the whole household gets flagged...." That's why I flagged "this is how I defined mother-in-law. Make sure it is what you want". You can then add lots of various permutations, and then use some of my code in #2 to fill it in for the household.

        The easiest way to proceed is probably to post some sample data that has the permutations you mention. I took a stab at it adding hhid 6-8 below. Also, based on your points, if seemed if things would be easier with an indicator variable for female-headed household, and male-headed household with a spouse.


        Code:
        * I added a few households with some of the variations
        dataex id hhid female relation
        clear
        input byte(id hhid female relation)
         3 1 1 3
         2 1 1 2
         4 1 0 3
         1 1 0 1
         7 2 1 6
         6 2 1 2
         8 2 0 3
         9 2 0 3
         5 2 0 1
        10 3 0 1
        11 3 1 2
        14 4 1 3
        12 4 0 1
        13 4 1 2
        18 5 1 4
        16 5 1 2
        15 5 0 1
        20 5 0 5
        19 5 0 5
        17 5 0 3
        23 6 1 6
        22 6 0 3
        21 6 1 1
        24 6 1 3
        27 7 0 3
        25 7 1 1
        26 7 1 3
        30 8 1 4
        28 8 0 1
        29 8 1 2
        end

        Code:
        * gen Female head-of-household
        sort hhid relation
        gen female_hoh = 0
        bysort hhid (relation): replace female_hoh=1 if female[1]==1 & relation[1]==1
        
        
        * gen male_hoh with spouse in home
        bysort hhid (relation): gen male_with_spouse = (female[1]==0 & relation[2]==2)
        
        * Rule #2: female_hoh with a child-in-law 
        gen temp_mil = 1 if female_hoh==1 & relation==6
        sort hhid temp_mil
        by hhid: replace has_mil=1 if temp_mil[1]==1
        
        * Rule #4 male headed household with a spouse and child-in-law
        replace temp_mil = 1 if male_with_spouse==1 & relation==6
        sort hhid temp_mil
        by hhid: replace has_mil=1 if temp_mil[1]==1
        
        * Rule #5 A male headed household with a spouse and a female parent
        replace temp_mil = 1 if male_with_spouse==1 & relation==4 & female==1
        sort hhid temp_mil
        by hhid: replace has_mil=1 if temp_mil[1]==1
        
        . list, sepby(hhid) noobs abbrev(18)
        
          +------------------------------------------------------------------------------------------------------+
          | id   hhid   female   relation   mil   count_mil   has_mil   female_hoh   male_with_spouse   temp_mil |
          |------------------------------------------------------------------------------------------------------|
          |  1      1        0          1     0           0         0            0                  1          . |
          |  2      1        1          2     0           0         0            0                  1          . |
          |  3      1        1          3     0           0         0            0                  1          . |
          |  4      1        0          3     0           0         0            0                  1          . |
          |------------------------------------------------------------------------------------------------------|
          |  5      2        0          1     0           1         1            0                  1          . |
          |  6      2        1          2     0           1         1            0                  1          . |
          |  9      2        0          3     0           1         1            0                  1          . |
          |  8      2        0          3     0           1         1            0                  1          . |
          |  7      2        1          6     1           1         1            0                  1          1 |
          |------------------------------------------------------------------------------------------------------|
          | 10      3        0          1     0           0         0            0                  1          . |
          | 11      3        1          2     0           0         0            0                  1          . |
          |------------------------------------------------------------------------------------------------------|
          | 12      4        0          1     0           0         0            0                  1          . |
          | 13      4        1          2     0           0         0            0                  1          . |
          | 14      4        1          3     0           0         0            0                  1          . |
          |------------------------------------------------------------------------------------------------------|
          | 15      5        0          1     0           1         1            0                  1          . |
          | 16      5        1          2     0           1         1            0                  1          . |
          | 17      5        0          3     0           1         1            0                  1          . |
          | 18      5        1          4     1           1         1            0                  1          1 |
          | 19      5        0          5     0           1         1            0                  1          . |
          | 20      5        0          5     0           1         1            0                  1          . |
          |------------------------------------------------------------------------------------------------------|
          | 21      6        1          1     0           1         1            1                  0          . |
          | 22      6        0          3     0           1         1            1                  0          . |
          | 24      6        1          3     0           1         1            1                  0          . |
          | 23      6        1          6     1           1         1            1                  0          1 |
          |------------------------------------------------------------------------------------------------------|
          | 25      7        1          1     0           0         0            1                  0          . |
          | 27      7        0          3     0           0         0            1                  0          . |
          | 26      7        1          3     0           0         0            1                  0          . |
          |------------------------------------------------------------------------------------------------------|
          | 28      8        0          1     0           1         1            0                  1          . |
          | 29      8        1          2     0           1         1            0                  1          . |
          | 30      8        1          4     1           1         1            0                  1          1 |
          +------------------------------------------------------------------------------------------------------+
        Hopefully this points you in the right direction--there may be other variations that you notice in the data.

        Also note: currently the code doesn't drop temp_mil between iterations, you may want to do that and re-create it each time. I called it temp_mil because it flags children-in-law with a mother in the hh (so he/she isn't the mil, but has a mother-in-law in the household).

        Also, temp_mil is 1 or blank. That makes it nice for sorting, but you usually would want to change those blanks to 0's.

        Hope this helps!

        --David
        Last edited by David Benson; 01 Feb 2019, 14:46.

        Comment


        • #5
          Thank you so much David and Joseph!

          With the rules provided, the data is perfectly sorted out.

          Comment


          • #6
            i have a similar question. i have the following information , i have individual incomes and i want to make a family income by grouping the indivuals income how is it done


            ----------------------- copy starting from the next line -----------------------
            [CODE]
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input double(ID person_number) float income
            3500 2 350
            460 1 4500
            93 3 1875
            4361 2 1625
            2385 1 8500
            3088 2 2625
            6995 1 850
            9054 1 250
            2277 2 1875
            8928 1 750
            2168 1 550
            2175 2 550
            3960 2 1375
            3271 1 950
            3046 1 550
            1617 3 1625
            7140 1 750
            6683 1 8500
            3970 1 1375
            2545 1 550
            2867 2 1375
            1529 1 1625
            3372 4 8500
            2523 2 550
            539 2 1125
            7976 1 1375
            8877 1 1625
            7280 2 850
            751 4 750
            1757 2 850
            6988 5 1875
            8504 1 1375
            8364 1 1375
            4410 2 3875
            3765 1 2375
            3738 1 1875
            6379 1 1625
            4053 1 550
            2271 2 550
            5693 2 450
            4789 3 350
            2232 1 12500
            6319 2 950
            2382 1 1375
            2123 2 50
            7139 3 450
            3268 1 750
            3904 2 .
            2619 1 750
            2467 3 1375
            6237 7 550
            5158 11 350
            3763 2 1875
            8818 1 750
            4645 3 950
            5624 1 1375
            1372 2 950
            653 1 2125
            1588 3 950
            5097 1 2375
            7309 1 550
            4212 1 550
            2514 2 2125
            7545 4 1375
            7590 1 50
            7573 1 550
            1269 1 550
            244 2 3375
            4277 3 950
            6027 1 550
            1782 3 550
            553 2 1625
            8821 2 650
            1718 1 2625
            599 4 450
            2250 4 1125
            8958 1 550
            3134 2 250
            940 1 550
            66 2 1125
            113 1 1625
            4282 2 150
            8477 4 850
            1879 3 1875
            3956 1 2125
            3733 2 750
            2382 2 2125
            612 1 50
            8026 1 50
            2036 2 .
            1996 1 650
            7091 2 950
            2102 1 550
            6711 2 250
            1447 1 950
            6728 2 1625
            1898 1 2875
            8019 1 1375
            2638 1 2875
            2266 5 1375

            Comment


            • #7
              Hi KAYDEN MARLI ,

              You might want to start a new thread on this.
              Thanks for sharing your data using dataex. The challenge is that you need some sort of household_id to group by, and then you can use the egen commands (egen is usually the command to use if you want to summarize by a group and give every member of that group the same value).
              There are a number of posts on Statalist showing how to do variations of this. See here, here, here, and here for examples. Also, if you search Statalist for "hhid" you will find many other examples.

              I didn't see a household_id in the data that you shared, so I created one (and then used some of your data).


              Code:
              * Example generated by -dataex-. For more info, type help dataex
              * dataex id hhid person_number income
              clear
              input int id byte(hhid person_number) int income
               460 1 1 4500
                93 1 3 1875
              4361 1 2 1625
              2385 2 1 8500
              3088 2 2 2625
              6995 3 1  850
              9054 4 1  250
              2277 4 2 1875
              8928 5 1  750
              2168 6 1  550
              2175 6 2  550
              3960 7 1 1375
              3271 7 2  950
              3046 7 3  550
              1617 7 4 1625
              end
              
              * hhid means "household_id"
              egen hh_income = sum(income), by(hhid)
              format hh_income %8.0gc   /* Just making it comma-separated for thousands  */
              
              list, sepby(hhid) abbrev(14)  noobs
              
                +--------------------------------------------------+
                |   id   hhid   person_number   income   hh_income |
                |--------------------------------------------------|
                |  460      1               1    4,500       8,000 |
                |   93      1               3    1,875       8,000 |
                | 4361      1               2    1,625       8,000 |
                |--------------------------------------------------|
                | 2385      2               1    8,500      11,125 |
                | 3088      2               2    2,625      11,125 |
                |--------------------------------------------------|
                | 6995      3               1      850         850 |
                |--------------------------------------------------|
                | 9054      4               1      250       2,125 |
                | 2277      4               2    1,875       2,125 |
                |--------------------------------------------------|
                | 8928      5               1      750         750 |
                |--------------------------------------------------|
                | 2168      6               1      550       1,100 |
                | 2175      6               2      550       1,100 |
                |--------------------------------------------------|
                | 3960      7               1    1,375       4,500 |
                | 3271      7               2      950       4,500 |
                | 3046      7               3      550       4,500 |
                | 1617      7               4    1,625       4,500 |
                +--------------------------------------------------+

              Comment


              • #8
                thnk u very much david
                Last edited by KAYDEN MARLI; 06 Nov 2021, 05:43.

                Comment


                • #9
                  i have a question when i execute your command i get a total of 17000 households, my household file has 14000 how come

                  Comment


                  • #10
                    i have execute your command but when i want to analyze on household level i get too many values error

                    Comment

                    Working...
                    X