Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help with count inside variable

    Hello everyone, I hope you are well

    I need help to count the number of existing households in a database from the household_id.
    Then, I want to know the average number of members per total household and per ethnicity.
    In addition, I want to know in how many households (%) the head of household (relations=1) is indigenous (ethnicity!=1).
    My database contains more than 12 million records and the truth is that I don't know how to move forward with that. I could use any help



    The variables I put in the data are:
    id_individual: identifier of each subject.
    id_household: identifier of each household.
    members: refers to the number of members per household.
    ethnicity: 1. non-indigenous, 2. indigenous_groupA, 3. indigenous_groupB, 4. indigenous_groupC.
    relations: 1.head_of_household, 2. partner_head, 3. children. 4.grandchildren, 5.other_family


    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id_individual id_household members ethnic relation)
     1  1 3 1 1
     2  1 3 1 2
     3  1 3 2 3
     4  2 3 2 1
     5  2 3 2 2
     6  2 3 2 3
     7  3 2 3 1
     8  3 2 3 3
     9  4 1 4 1
    10  5 1 4 1
    11  6 5 1 1
    12  6 5 1 2
    13  6 5 1 3
    14  6 5 1 4
    15  6 5 1 5
    16  7 3 1 1
    17  7 3 1 3
    18  7 3 2 4
    19  8 1 1 1
    20  9 3 1 1
    21  9 3 1 2
    22  9 3 1 3
    23 10 4 1 1
    24 10 4 1 2
    25 10 4 1 3
    26 10 4 1 4
    end
    ------------------ copy up to and including the previous line ------------------






    I appreciate any kind of help
    Thank you very much

    Best regards

    Moses


  • #2
    I am more than a little confused about why you want this and where you are going with it, but it appears to me that a couple of simple tables give what you ask for:
    Code:
    . ta eth relation
    
               |                        relation
        ethnic |         1          2          3          4          5 |     Total
    -----------+-------------------------------------------------------+----------
             1 |         6          4          4          2          1 |        17
             2 |         1          1          2          1          0 |         5
             3 |         1          0          1          0          0 |         2
             4 |         2          0          0          0          0 |         2
    -----------+-------------------------------------------------------+----------
         Total |        10          5          7          3          1 |        26
    r; t=0.00 16:02:10
    
    . ta id_h eth, su(members) nost nofreq
    
                                 Means of members
    
    id_househo |                 ethnic
            ld |         1          2          3          4 |     Total
    -----------+--------------------------------------------+----------
             1 |         3          3          .          . |         3
             2 |         .          3          .          . |         3
             3 |         .          .          2          . |         2
             4 |         .          .          .          1 |         1
             5 |         .          .          .          1 |         1
             6 |         5          .          .          . |         5
             7 |         3          3          .          . |         3
             8 |         1          .          .          . |         1
             9 |         3          .          .          . |         3
            10 |         4          .          .          . |         4
    -----------+--------------------------------------------+----------
         Total | 3.7058824          3          2          1 | 3.2307692

    Comment


    • #3
      accidently duplicated so deleted

      Comment


      • #4
        I think one issue is the difference between counting over households and over individuals.

        Thus for example 7/26 individuals live in households with indigenous heads, but the fraction of such households is 4/10.

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte(id_individual id_household members ethnic relation)
         1  1 3 1 1
         2  1 3 1 2
         3  1 3 2 3
         4  2 3 2 1
         5  2 3 2 2
         6  2 3 2 3
         7  3 2 3 1
         8  3 2 3 3
         9  4 1 4 1
        10  5 1 4 1
        11  6 5 1 1
        12  6 5 1 2
        13  6 5 1 3
        14  6 5 1 4
        15  6 5 1 5
        16  7 3 1 1
        17  7 3 1 3
        18  7 3 2 4
        19  8 1 1 1
        20  9 3 1 1
        21  9 3 1 2
        22  9 3 1 3
        23 10 4 1 1
        24 10 4 1 2
        25 10 4 1 3
        26 10 4 1 4
        end
        
        egen tag = tag(id_household)
        
        egen head_indig = max(relation == 1 & ethnic > 1), by(id_household)
        
        tab head_indig 
        
        tab head_indig if tag
        . tab head_indig

        Code:
         head_indig |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  0 |         19       73.08       73.08
                  1 |          7       26.92      100.00
        ------------+-----------------------------------
              Total |         26      100.00
        
        . 
        . tab head_indig if tag 
        
         head_indig |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  0 |          6       60.00       60.00
                  1 |          4       40.00      100.00
        ------------+-----------------------------------
              Total |         10      100.00

        For some technique see for example

        https://journals.sagepub.com/doi/pdf...867X1101100210

        https://www.stata.com/support/faqs/d...ble-recording/

        Comment


        • #5
          Thank you very much Rich and Nick.
          Your guidance has helped me to synthesize what I was looking for.
          Regards

          Moses

          Comment

          Working...
          X