
  • calculating household size

    Hi,

    Can anyone help me calculate household size by household ID (one hhsize per hhid, not repeated on every row as below) from the data example below?
    Here 'hhid' means 'household ID' and 'pid' means 'person ID'.
    clear
    input float hhsize double hhid float pid
    4 1 1
    4 1 1
    4 1 2
    4 1 3
    4 1 4
    3 2 1
    3 2 2
    3 2 3
    4 3 1
    4 3 2
    4 3 3
    end

    Thank you so much, Rumana

  • #2
    I don't understand what you are looking for. The variable hhsize correctly gives the size of each household (i.e. number of observations with the same hhid). If this isn't what you are looking for, what is wrong with it, or what do you want instead?



    • #3
      Like Clyde, I am confused about what you want. I am particularly confused by the first two lines of your data (they are duplicates). However, it does appear that you don't want what you have, so here is a guess at something else:
      Code:
      egen hhnum=count(1), by(hhid)
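
To illustrate how the guess above behaves on the posted data, here is a minimal sketch. It assumes the example dataset from post #1; the helper variable `first` is hypothetical and only used to show one row per household. Because `count(1)` counts observations rather than distinct persons, the duplicated row (hhid 1, pid 1) is included in the count.

```stata
clear
input float hhsize double hhid float pid
4 1 1
4 1 1
4 1 2
4 1 3
4 1 4
3 2 1
3 2 2
3 2 3
4 3 1
4 3 2
4 3 3
end

* count(1) counts every observation within hhid, duplicates included
egen hhnum = count(1), by(hhid)

* hypothetical helper: flag one observation per household for display
egen byte first = tag(hhid)
list hhid hhsize hhnum if first, clean noobs
* hhnum is 5 for hhid 1 (the duplicate row is counted), 3 for hhid 2 and 3 for hhid 3
```

If distinct persons rather than rows are wanted, the duplicate row would need to be handled first.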



      • #4
        Actually, I fail to understand why hhsize is 4 for hhid 3.

        With that said, perhaps this is what is desired - one observation per hhid.
        Code:
        clear
        input float hhsize double hhid float pid
        4 1 1
        4 1 1
        4 1 2
        4 1 3
        4 1 4
        3 2 1
        3 2 2
        3 2 3
        4 3 1
        4 3 2
        4 3 3
        end
        egen person = tag(hhid pid)
        collapse (sum) hhsize2=person, by(hhid)
        list, clean noobs
        Results:
        . list, clean noobs
        
            hhid   hhsize2  
               1         4  
               2         3  
               3         3



        • #5
          Sorry for the confusion! I want to get the household size for each hhid, as below:


          hhid  hhsize
             1       4
             2       3
             3       4


          I hope it makes sense now.


          Thank you,
          Rumana
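
One possible reading of the table above is "one row per hhid, keeping the hhsize value already recorded in the data". A minimal sketch under that assumption follows. It uses the example data from post #1 and assumes hhsize is constant within each hhid, which the `assert` line checks before collapsing. This is only one interpretation, not necessarily what the replies in this thread intend.

```stata
clear
input float hhsize double hhid float pid
4 1 1
4 1 1
4 1 2
4 1 3
4 1 4
3 2 1
3 2 2
3 2 3
4 3 1
4 3 2
4 3 3
end

* Assumption: hhsize does not vary within a household; stop with an error if it does
bysort hhid (hhsize): assert hhsize[1] == hhsize[_N]

* Reduce to one observation per household, keeping the recorded hhsize
collapse (first) hhsize, by(hhid)
list, clean noobs
```

Note that `collapse` replaces the data in memory, so the person-level rows are gone afterwards; `preserve` and `restore` can be wrapped around it if they are still needed.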



          • #6
            Like William, I was thinking along the lines of unique pids within hhid. If you want a new variable in the same dataset (without collapsing):

            Code:
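            * tag() marks one observation per distinct hhid-pid pair
            egen person = tag(hhid pid)
            * total() is the documented name for egen's older sum() function
            bysort hhid: egen sum_person = total(person)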
            list, clean noobs
            Stata/MP 14.1 (64-bit x86-64)
            Revision 19 May 2016
            Win 8.1
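
Since the recorded hhsize and the count of distinct pids disagree for hhid 3 in the example, a quick consistency check may be useful. The sketch below assumes the example data from post #1 and uses `total()` in place of `sum()`; the variable `onehh` is a hypothetical helper used only to list one row per household.

```stata
clear
input float hhsize double hhid float pid
4 1 1
4 1 1
4 1 2
4 1 3
4 1 4
3 2 1
3 2 2
3 2 3
4 3 1
4 3 2
4 3 3
end

* Count distinct persons within each household
egen person = tag(hhid pid)
bysort hhid: egen sum_person = total(person)

* hypothetical helper: one flagged observation per household
egen byte onehh = tag(hhid)

* List households whose recorded size differs from the number of distinct pids
list hhid hhsize sum_person if onehh & hhsize != sum_person, clean noobs
* only hhid 3 is listed: recorded hhsize is 4, distinct pids number 3
```

Which of the two numbers is right for such a household depends on how the data were collected, so this only flags the mismatch.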



            • #7
              Thank you so much William! Yes, this is what I am looking for!
