calculating household size

rumana akter

Join Date: Jul 2018

Posts: 38
#1

calculating household size

13 Jul 2018, 18:09

Hi,

Can anyone help me calculate household size by household ID (one hhsize per hhid, not hhsize repeated for every repeated hhid as below) from the data example below?
Here 'hhid' means 'household ID' and 'pid' means 'person ID'.
clear
input float hhsize double hhid float pid
4 1 1
4 1 1
4 1 2
4 1 3
4 1 4
3 2 1
3 2 2
3 2 3
4 3 1
4 3 2
4 3 3
end

Thank you so much, Rumana
Clyde Schechter

Join Date: Apr 2014

Posts: 30003
#2

13 Jul 2018, 18:18

I don't understand what you are looking for. The variable hhsize correctly gives the size of each household (i.e. number of observations with the same hhid). If this isn't what you are looking for, what is wrong with it, or what do you want instead?
Rich Goldstein

Join Date: Mar 2014

Posts: 4447
#3

13 Jul 2018, 18:24

Like Clyde, I am confused about what you want. I am particularly confused by the first two lines of your data (they are duplicates). However, it does appear that you don't want what you have, so here is a guess at something else:

Code:

egen hhnum=count(1), by(hhid)

William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

13 Jul 2018, 18:32

Actually, I fail to understand why hhsize is 4 for hhid 3.

With that said, perhaps this is what is desired - one observation per hhid.

Code:

clear
input float hhsize double hhid float pid
4 1 1
4 1 1
4 1 2
4 1 3
4 1 4
3 2 1
3 2 2
3 2 3
4 3 1
4 3 2
4 3 3
end
egen person = tag(hhid pid)
collapse (sum) hhsize2=person, by(hhid)
list, clean noobs

Code:

. list, clean noobs

    hhid   hhsize2  
       1         4  
       2         3  
       3         3


rumana akter

Join Date: Jul 2018

Posts: 38
#5

13 Jul 2018, 18:36

Sorry for the confusion! I want the household size for each hhid, one row per household, as below:

hhid hhsize
1 4
2 3
3 4

I hope it makes sense now.

Thank you,
Rumana
Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#6

13 Jul 2018, 18:38

Like William, I was thinking along the lines of unique pids within hhid. If you want a new variable in the same dataset (without collapsing):

Code:

egen person=tag(hhid pid)
bysort hhid: egen sum_person=sum(person)
list, clean noobs
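
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, assuming the example data from post #1. The variable names n_persons and onehh are illustrative and do not come from the thread. It uses total(), which, as far as I know, is the currently documented name of the egen function written as sum() in older code.

```stata
* Example data copied from post #1
clear
input float hhsize double hhid float pid
4 1 1
4 1 1
4 1 2
4 1 3
4 1 4
3 2 1
3 2 2
3 2 3
4 3 1
4 3 2
4 3 3
end

* Flag exactly one observation per distinct (hhid, pid) pair,
* so the duplicated person in household 1 is counted once
egen person = tag(hhid pid)

* Count distinct persons within each household, keeping every observation
* (n_persons is an illustrative name)
bysort hhid: egen n_persons = total(person)

* Flag one observation per household and list the recorded size (hhsize)
* next to the computed count for comparison (onehh is an illustrative name)
egen onehh = tag(hhid)
list hhid hhsize n_persons if onehh, clean noobs
```

Listing hhsize next to the computed count makes it easy to spot households where the two disagree, such as hhid 3 in the example, which is recorded with hhsize 4 but has only three distinct pid values.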

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1
rumana akter

Join Date: Jul 2018

Posts: 38
#7

13 Jul 2018, 18:39

Thank you so much William! Yes, this is what I am looking for!