generating dummy for multiple occurrence variable DHS HR

Ece Kafali

Join Date: May 2018

Posts: 5
#1

22 May 2018, 07:07

Hi!

I am using DHS household data, with one record per household. This is an example of what the data look like:

I want to create a dummy variable equal to 1 if a household has one or more members aged 60 or older (any age_* >= 60), but I cannot figure out a quick way to do it other than creating a separate dummy for each of the 27 age_* variables.
Any advice would be greatly appreciated.

Regards
Baptiste Ottino

Join Date: May 2018

Posts: 21
#2

22 May 2018, 08:13

Hello Ece!

Assuming no one in your data set is older than 120, you could use:

Code:

egen flag=anymatch(age_*), values(60/120)

It will return 1 for any observation with at least one family member aged 60 or older, and 0 otherwise.

edit:

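I realize that checking some 60 possible values every time is a bit of a hassle. So you could also do the following, to the same effect:

Code:

egen flag=rowmax(age_*)
replace flag=0 if flag < 60
* Missing counts as larger than any number, so exclude it explicitly
replace flag=1 if flag >= 60 & !missing(flag)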

Up to you.
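
A more compact variant of the rowmax() approach, as a sketch only: inrange() returns 0 when its first argument is missing, so a household with no recorded ages gets 0 rather than 1. The names maxage and any60plus are illustrative and not from the original data.

Code:

* Oldest recorded age in each household (missing if no age is recorded)
egen maxage = rowmax(age_*)

* 1 if the oldest member is 60 or older, 0 otherwise (including all-missing rows)
gen byte any60plus = inrange(maxage, 60, .)
drop maxage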

Last edited by Baptiste Ottino; 22 May 2018, 08:59.
Ece Kafali

Join Date: May 2018

Posts: 5
#3

22 May 2018, 13:27

Dear Baptiste,

Thank you very much.

I'm sorry, I just noticed that I forgot to mention something. I want to generate this dummy only if the household member aged 60 or older is female. As with age_*, I have a sex_* variable for each household member.
Could you help me with that?

Thanks again.
Regards.
Baptiste Ottino

Join Date: May 2018
Posts: 21

23 May 2018, 01:36

Hello Ece,

This gets a bit more complicated. You could for example switch from wide to long and back to evaluate your condition. Your dataset looks like this:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(household_id age_1 age_2 age_3 sex_1 sex_2 sex_3)
1 64  .  . 2 . .
2 45 43 12 1 2 1
3 64 62  . 1 2 .
4 64 59 25 1 2 2
end

Where 1 is male and 2 is female. Do:

Code:

* Reshapes wide to long
reshape long age_ sex_, i(household_id)

* Generates a temp variable: 1 if female and aged 60 or older, missing otherwise
* (age_ < . keeps missing ages from counting as 60 or older)
gen temp = 1 if age_ >= 60 & age_ < . & sex_ == 2

* Generates a per-household flag variable based on temp, and drops temp
bysort household_id: egen flag = max(temp)
drop temp

* Reshape long to wide
reshape wide

The result is a flag variable with 1 when your condition is met, missing if not. Hope this helps.

Nick Cox

Join Date: Mar 2014
Posts: 35724

23 May 2018, 02:22

For a riff on rowwise operations see https://www.stata-journal.com/sjpdf....iclenum=pr0046

pr0046 is thus revealed as an otherwise unpredictable search term for this forum.

One of the points raised there is that often the best strategy if you find yourself doing this a lot is to reshape long -- and stay that way. Personally I would do that with household or family data of this kind.
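
As a rough sketch of what staying long could look like, assuming the wide example data from this thread are in memory (the names member, female60 and anyfem60 are illustrative, not from the original data):

Code:

* One observation per household member; member holds the old suffix 1, 2, 3
reshape long age_ sex_, i(household_id) j(member)

* 1 if this member is female and aged 60 or older, 0 otherwise
* (inrange() returns 0 for missing ages, so empty member slots count as 0)
gen byte female60 = inrange(age_, 60, .) & sex_ == 2

* Household-level (0, 1) indicator, repeated on every member's row
bysort household_id: egen anyfem60 = max(female60)

With the data kept long, further household-level summaries can be built the same way without reshaping back and forth.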

Another is just to write your own loop. With Baptiste's excellent data example (Ece: please note that we ask for such; FAQ Advice #12),

Code:

clear 

input float(household_id age_1 age_2 age_3 sex_1 sex_2 sex_3)
1 64  .  . 2 . .
2 45 43 12 1 2 1
3 64 62  . 1 2 .
4 64 59 25 1 2 2
end

gen count = 0 

quietly forval j = 1/3 { 
    replace count = count + (inrange(age_`j', 60, .) & (sex_`j' == 2)) 
} 

gen anyfemGE60 = count >= 1 

list

     +-----------------------------------------------------------------------------+
     | househ~d   age_1   age_2   age_3   sex_1   sex_2   sex_3   count   anyfe~60 |
     |-----------------------------------------------------------------------------|
  1. |        1      64       .       .       2       .       .       1          1 |
  2. |        2      45      43      12       1       2       1       0          0 |
  3. |        3      64      62       .       1       2       .       1          1 |
  4. |        4      64      59      25       1       2       2       0          0 |
     +-----------------------------------------------------------------------------+

I recommend (0, 1) indicators, not (1, .) indicators.

See also https://www.stata.com/support/faqs/d...rue-and-false/ (especially if the replace statement above seems cryptic.)
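
To unpack the replace statement a little: in Stata a true logical expression evaluates to 1 and a false one to 0, so adding the expression to count adds 1 exactly when the condition holds. A minimal illustration with literal values taken from the example data (display only, no data set needed):

Code:

* Age 64 and sex 2 (female): both conditions true, shows 1
display inrange(64, 60, .) & (2 == 2)

* Age 64 and sex 1 (male): second condition false, shows 0
display inrange(64, 60, .) & (1 == 2)

* Missing age and missing sex: both conditions false, shows 0
display inrange(., 60, .) & (. == 2)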

Ece Kafali

Join Date: May 2018

Posts: 5
#6

28 May 2018, 05:29

Dear Baptiste and Nick,
Thank you very much for your help.
Best regards,
Ece