randomselect

Giovanni Piumatti

Join Date: Sep 2016

Posts: 14
#1

11 Feb 2020, 01:16

Hello,

I am trying to randomly select a subsample of participants from my data set. I found the randomselect command useful for this, but I don't know how to set the seed in my syntax so that the same observations are selected on every subsequent run of the do-file.

Basically, I want to select two groups based on the following characteristics:

Group 1: N=3000, smokers, 50% female, aged 50-80
Group 2: N=3000, non-smokers, 50% female, aged 20-80

Here is my syntax (with the seed command integrated but not working as expected):

Code:

randomselect if smoking == 1 & gender == 1, gen(sample_1) n(1500) seed(7492001)
randomselect if smoking == 1 & gender == 0 & sample_1 != 1, gen(sample_2) n(1500) seed(7492001)
randomselect if smoking == 0 & gender == 1, gen(sample_3) n(1500) seed(7492001)
randomselect if smoking == 0 & gender == 0 & sample_1 != 1, gen(sample_4) n(1500) seed(7492001)
g sample_smoking = 0 if inlist(1, sample_1, sample_2)
replace sample_smoking = 1 if inlist(1, sample_3, sample_4)
drop sample_1-sample_4

Thank you in advance for any comment!

Giovanni
Sergiy Radyakin

Join Date: Apr 2014

Posts: 1867
#2

11 Feb 2020, 09:06

See from around slide 40 here:
https://github.com/BPLIM/Workshops/b...y_Radyakin.pdf
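
For readers who cannot open the slides, here is a minimal sketch of one common way to get a reproducible stratified draw using only built-in Stata commands rather than randomselect. It is not taken from the slides, and it rests on assumptions: that smoking and gender are coded 0/1 as in the question, that an age variable named age exists, and that a unique identifier exists (hypothetically named id here). The names u, eligible, rank, and selected are illustrative only.

```stata
* Sketch only: uses built-in commands, not randomselect.
* Assumed variables: smoking (0/1), gender (0/1), age, and a unique id
* (id is a hypothetical name; use whatever uniquely identifies a row).

set seed 7492001
sort id                          // fix the row order before drawing random numbers
generate double u = runiform()   // one uniform draw per observation

* Eligibility: smokers aged 50-80, non-smokers aged 20-80, gender not missing
generate byte eligible = ((smoking == 1 & inrange(age, 50, 80)) | ///
                          (smoking == 0 & inrange(age, 20, 80))) & ///
                         !missing(gender)

* Rank eligible observations by u within each smoking-by-gender cell
bysort smoking gender eligible (u): generate long rank = _n if eligible

* Keep the first 1500 per cell: 3000 per smoking group, half from each gender
generate byte selected = eligible & rank <= 1500
```

The draw should repeat across runs only if both the seed and the sort order are the same each time, which is why the data are sorted on a unique identifier before runiform() is called. If a cell contains fewer than 1500 eligible observations, this sketch simply selects all of them.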