Random dummy variable with restrictions

Felix Stips

Join Date: Nov 2014

Posts: 110
#1

07 Jan 2021, 02:39

I have a panel dataset and want to create a random dummy variable with mean 0.2 that marks a random subset of the observations. Without any further restrictions, code like this would work:

Code:

set seed 2803
gen id = _n
gen random = runiform()
sort random
local cut = round((_N / 10) * 2)
gen sample = _n <= `cut'
drop random

However, I want to impose the restriction that each individual i and each time period t is marked at least once by this dummy, i.e., appears in the sample at least once. Any ideas on how to approach this?

Note: I tried the gsample command by B. Jann, once clustering and once stratifying on company and year, but neither seems to work for this.

Code:

clear
use http://www.stata-press.com/data/r16/grunfeld.dta
distinct company
distinct time
xtset company year
gsample 20, wor cluster(company year) percent alt
distinct company
distinct time

Code:

clear
use http://www.stata-press.com/data/r16/grunfeld.dta
distinct company
distinct time
xtset company year
gsample 20, wor strat(company year) percent alt
distinct company
distinct time

Last edited by Felix Stips; 07 Jan 2021, 03:34.

Felix Stips

Join Date: Nov 2014
Posts: 110

07 Jan 2021, 05:28

Okay, so first of all, using the SSC command randomtag by Robert Picard makes this easier because the data stay in place. Second, one approach is to pick one observation from each company and from each time period, and then tag the remaining observations at random until the target sample size is reached.

Code:

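set seed 12345
use http://www.stata-press.com/data/r16/grunfeld.dta, clear

* target sample size: 20% of all observations
local percent = 20
local samplesize = round((_N / 100) * `percent')

* tag one observation per company and one per time period
* (egen decides which observation in each group is tagged)
egen company1 = tag(company)
egen time1 = tag(time)

* count the guaranteed observations once, even if the same observation
* is tagged for both its company and its time period
qui count if company1 == 1 | time1 == 1
local n1 = r(N)

* number of observations still to be drawn at random
local n3 = `samplesize' - `n1'

randomtag if company1 == 0 & time1 == 0, count(`n3') gen(sample)
replace sample = 1 if time1 == 1 | company1 == 1
drop time1 company1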
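
To confirm that the restriction holds, a quick check along these lines could be run right after the code above. This is only a sketch: it assumes the variable sample created above is still in memory, and anycompany and anytime are hypothetical helper names chosen for illustration.

Code:

* share of tagged observations; compare the reported mean with the 20% target
summarize sample

* every company and every time period should contain at least one tagged observation
* (note that bysort re-sorts the data by the grouping variable)
bysort company: egen byte anycompany = max(sample)
bysort time: egen byte anytime = max(sample)
assert anycompany == 1 & anytime == 1
drop anycompany anytime

If the assert passes silently, every company and every time period is represented in the sample; if not, Stata reports how many observations belong to a group with no tagged observation.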

Last edited by Felix Stips; 07 Jan 2021, 05:35.
