Generating random numbers

Tina Samuel

26 Jul 2023, 01:17

Hello everyone,
I have a dataset for over 600 households. Here's an example of it:

Key         | Parent | hh_members_count | Index | name   | age
123AA57begs | Sofia  | 3                | 1     | Sam    | 37
123AA57begs | Sofia  | 3                | 2     | Nancy  | 15
123AA57begs | Sofia  | 3                | 3     | Mark   | 2
983aM04bb5z | Karma  | 3                | 1     | Joseph | 38
983aM04bb5z | Karma  | 3                | 2     | Hariot | 4
983aM04bb5z | Karma  | 3                | 3     | Kevin  | 1.5

I would like to create a random variable called hhid that takes the same value for every member of a household but is unique to each household. Here's an example:

Key         | Parent | hh_members_count | Index | name   | age | hhid
123AA57begs | Sofia  | 3                | 1     | Sam    | 37  | 2517
123AA57begs | Sofia  | 3                | 2     | Nancy  | 15  | 2517
123AA57begs | Sofia  | 3                | 3     | Mark   | 2   | 2517
983aM04bb5z | Karma  | 3                | 1     | Joseph | 38  | 3089
983aM04bb5z | Karma  | 3                | 2     | Hariot | 4   | 3089
983aM04bb5z | Karma  | 3                | 3     | Kevin  | 1.5 | 3089

The hhid shouldn't be sequential (i.e. not 1, 2, 3...). It should also have a minimum of 3 digits and a maximum of 5 digits. In other words, the hhid should be as unique as the key, except that it takes numerical values only.
I would greatly appreciate your help in this regard. Thank you.

Carlo Lazzaro

26 Jul 2023, 01:33

Tina:
you may want to consider the following toy example:

Code:

. set obs 2

. g family=1

. g hh_members=_n

. expand 2

. replace family=2 in 3/4

. label define hh_members 1 "mother" 2 "daughter"

. label val hh_members hh_members

. bysort family (hh_members): gen wanted=runiform() if _n==1


. bysort family ( hh_members): replace wanted=wanted[1] if wanted==.


. list

     +------------------------------+
     | family   hh_mem~s     wanted |
     |------------------------------|
  1. |      1     mother   .3488717 |
  2. |      1   daughter   .3488717 |
  3. |      2     mother   .2668857 |
  4. |      2   daughter   .2668857 |
     +------------------------------+

.

Kind regards,
Carlo
(Stata 19.0)
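
The toy example above yields values between 0 and 1, whereas the question asks for whole numbers with 3 to 5 digits. One possible adaptation is sketched below, assuming Tina's variables Key and Index are in memory and a Stata version that provides runiformint(); the seed and the helper variable onehh are illustrative choices only. Independent draws do not guarantee uniqueness: with about 600 households drawn from 99,900 possible values, at least one repeat is more likely than not, so the sketch ends with a check.

Code:

* Sketch only: assumes Key and Index exist; onehh is a hypothetical helper
set seed 2803

* draw one integer per household, then copy it to the other members
bysort Key (Index): gen long hhid = runiformint(100, 99999) if _n == 1
bysort Key (Index): replace hhid = hhid[1] if missing(hhid)

* independent draws can repeat across households, so look for repeats
* among one observation per household
egen byte onehh = tag(Key)
duplicates report hhid if onehh
drop onehh

If duplicates are reported, the draw needs repeating with a different seed, or the values need to be sampled without replacement instead.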

Nick Cox

26 Jul 2023, 01:52

There might well be a simpler method, but your constraints

1. Between 100 and 99999

2. Matching households uniquely, not individuals

3. Random otherwise

all have to be satisfied.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str11 Key str5 Parent byte(hh_members_count Index) str6 name double age
"123AA57begs" "Sofia" 3 1 "Sam"     37
"123AA57begs" "Sofia" 3 2 "Nancy"   15
"123AA57begs" "Sofia" 3 3 "Mark"     2
"983aM04bb5z" "Karma" 3 1 "Joseph"  38
"983aM04bb5z" "Karma" 3 2 "Hariot"   4
"983aM04bb5z" "Karma" 3 3 "Kevin"  1.5
end

save SAFECOPY 

bysort Key : keep if _n == 1 
* keep only Key, so that the final merge takes every other variable from SAFECOPY 
keep Key 
save KEY 
count 
local N = r(N)

clear 
set obs 99900 
range id 100 99999
set seed 2803 
gen double rnd = runiform()
sort rnd 
keep in 1/`N'

merge 1:1 _n using KEY 
assert _merge == 3 
drop _merge 

merge 1:m Key using SAFECOPY 

list
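
The same idea of sampling without replacement could probably also be done without saving intermediate files. The following is only a sketch, assuming Stata 16 or later (for frames) and the household data with Key already in memory; the frame name pool and the variable slot are made up for illustration. The last three lines check the three constraints listed above.

Code:

* Sketch only: assumes Stata 16+ and the data with Key in memory
set seed 2803

* all candidate ids 100..99999, shuffled into random order
frame create pool
frame pool {
    set obs 99900
    gen long hhid = _n + 99
    gen double rnd = runiform()
    sort rnd
    gen long slot = _n
}

* number the households 1..N and fetch the id sitting in the same slot
egen long slot = group(Key)
frlink m:1 slot, frame(pool)
frget hhid, from(pool)
drop slot pool
frame drop pool

* checks: 3 to 5 digits, one hhid per Key, one Key per hhid
assert inrange(hhid, 100, 99999)
bysort Key (hhid): assert hhid[1] == hhid[_N]
bysort hhid (Key): assert Key[1] == Key[_N]

Because each household takes a different slot of the shuffled pool, no two households can share an hhid, and the number of households only has to stay at or below 99,900.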
