I have a panel dataset and want to create a random dummy variable with mean 0.2 that marks a random subset of the dataset. Without further restrictions, the code in the first block below does this.
However, I want to impose the restriction that each individual i and each time period t is marked at least once by this dummy, i.e., is present in the sample. Any ideas on how to approach this?
Code:
set seed 2803
gen id = _n
gen random = runiform()
sort random
local cut = round((_N / 10) * 2)
gen sample = _n <= `cut'
drop random
Note: I tried the gsample command by B. Jann, but it does not seem to work for this, with either the cluster() or the strata() option:
Code:
clear
use http://www.stata-press.com/data/r16/grunfeld.dta
distinct company
distinct time
xtset company year
gsample 20, wor cluster(company year) percent alt
distinct company
distinct time
Code:
clear
use http://www.stata-press.com/data/r16/grunfeld.dta
distinct company
distinct time
xtset company year
gsample 20, wor strat(company year) percent alt
distinct company
distinct time
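Coming back to the restriction that every individual and every time period is marked at least once: one brute-force idea is to keep redrawing the unrestricted 20% sample until the restriction happens to hold. The block below is only a rough sketch of that idea on the same grunfeld data. The helper variables anyid and anyt and the locals ok and tries are names made up for this illustration. It assumes the panel is large enough that a valid draw turns up after a reasonable number of tries. With many individuals or periods and few observations per cell, the loop could run for a long time or never finish.

```stata
clear
use http://www.stata-press.com/data/r16/grunfeld.dta
set seed 2803

* number of observations to mark (20% of the data, as in the unrestricted code)
local cut = round(_N * 0.2)

local ok = 0
local tries = 0
while !`ok' {
    local ++tries
    * remove the variables from the previous draw (does nothing on the first pass)
    capture drop random sample anyid anyt

    * unrestricted draw: mark the `cut' observations with the smallest random numbers
    gen double random = runiform()
    sort random
    gen byte sample = _n <= `cut'

    * anyid / anyt (illustrative names) equal 1 if the company / year is marked at least once
    egen byte anyid = max(sample), by(company)
    egen byte anyt  = max(sample), by(year)

    * accept the draw only if no company and no year is left unmarked
    quietly count if anyid == 0 | anyt == 0
    local ok = (r(N) == 0)
}

drop random anyid anyt
sort company year
display "accepted draw after `tries' tries"
summarize sample
```

Because the number of marked observations is fixed at `cut', the dummy keeps a mean of 0.2 in every accepted draw. The restriction only changes which sets of observations can be selected.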