random selection between observation

Mehdi Amani

11 Jan 2023, 05:05

Hi everyone,
In the dataset below, I want to randomly drop one of the observations that share the same wage within a given year. For example, I would be happy for Stata to randomly drop either business a or business b in year 93. Do you have any ideas?
Thanks a lot.

year  wage  business
93    100   a
93    100   b
94    150   c
95    200   d
95    250   e
96    300   f

Joseph Coveney

11 Jan 2023, 05:26

Code:

set seed 357993207                     // <= important: makes the random draw reproducible
generate double randu = runiform()
isid randu                             // <= also important if large dataset: confirms there are no ties in randu
bysort year wage (randu): keep if _n == 1

Alternatively:

Code:

set seed 928467163
generate double randu = runiform()
isid year wage randu, sort             // checks for ties within year-wage groups and sorts in one step
by year wage:  keep if _n == 1

Mehdi Amani

11 Jan 2023, 09:18

Thank you so much. Another question: look at the table below. In year 93, because two observations have the same wage in the same year, I would like to drop the observation for the business where the individual did not work in the previous year. That is, I want to keep the observation for business a in year 93. For year 94, the choice should be random, as before.
Many thanks.
year  wage  business
92    100   a
93    150   a
93    150   b
94    200   c
94    200   d
94    200   h
95    300   e
96    350   f
96    400   g
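
The rule described above could be sketched as follows. This is only a minimal sketch, not taken from a reply in the thread. It assumes that each business appears at most once per year, and that "worked in the previous year" means the same business has an observation in year - 1. The variable names randu and no_prev and the seed value are illustrative choices.

```stata
clear
input year wage str1 business
92 100 a
93 150 a
93 150 b
94 200 c
94 200 d
94 200 h
95 300 e
96 350 f
96 400 g
end

set seed 20230111                      // arbitrary seed, so the random draw is reproducible
generate double randu = runiform()
isid randu                             // confirms there are no ties in randu

// Assumption: at most one observation per business per year
isid business year

// no_prev = 1 when the business has no observation in the previous year
// (for a business's first observation, year[_n-1] is missing, so no_prev = 1)
bysort business (year): generate byte no_prev = (year[_n-1] != year - 1)

// Within each year-wage group, observations with no_prev == 0 sort first;
// remaining ties are broken at random by randu
bysort year wage (no_prev randu): keep if _n == 1

isid year wage                         // exactly one observation per year-wage group remains
drop randu no_prev
```

With the example data, this keeps business a for year 93, because a also appears in year 92, and picks one of c, d, or h at random for year 94, because none of them appears in year 93.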