Hello Everyone,
In an upcoming workshop, I intend to demonstrate the following question: Does randomization really ensure statistically similar samples on unobserved variables? In theory, randomization yields samples that are statistically similar, in expectation, on both observed and unobserved characteristics.
To check this, I shall delete the gender variable from a dataset (case1) and draw a number of random samples (with replacement). Since the dataset will then have no gender variable, gender is treated as unobserved in this setting. I will then add the gender variable back to each sample (manually) to see whether the sample gender proportions match the gender proportions in the original dataset (case1, which has the gender variable).
Assuming my original dataset is "case1", can someone please share a list of commands I can use to collect many such samples? Only then will I be able to show that, on average, the mean of the sample proportions matches the population proportions.
The list of tasks is as follows:
1) Observe gender proportions in the dataset
2) Delete the gender variable
3) Select a random sample
4) Add the gender column back to the sample
5) Observe gender proportions in the sample
6) Repeat
Is there code that can help me do this in Stata, say 10,000 times? Please note that I want to keep the gender proportions from all samples and eventually report the mean across all samples.
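To make the task list concrete, here is a rough, untested sketch of the kind of thing I have in mind. It rests on several assumptions that may not hold: that case1.dta sits in the working directory, that gender is coded 0/1 (so its mean is a proportion), and that there is a unique identifier, which I call id here purely for illustration. The program name onesample and the seed are also placeholders. Corrections are very welcome.

```stata
* Sketch only. Assumptions: case1.dta is in the working directory,
* gender is coded 0/1, and id (hypothetical name) uniquely identifies rows.
use case1, clear

* Step 1: gender proportion in the full dataset
summarize gender, meanonly
scalar pop_prop = r(mean)

* Steps 2-5 wrapped in a program that returns one sample proportion
capture program drop onesample
program define onesample, rclass
    preserve
    * Step 2: gender becomes "unobserved"
    drop gender
    * Step 3: draw a sample with replacement, same size as the data
    bsample
    * Step 4: bring gender back from the original file by id
    merge m:1 id using case1, keepusing(gender) keep(match) nogenerate
    * Step 5: proportion in this sample
    summarize gender, meanonly
    return scalar prop = r(mean)
    restore
end

* Step 6: repeat 10,000 times and keep every sample proportion
set seed 12345
simulate prop = r(prop), reps(10000) nodots: onesample

* Mean of the 10,000 sample proportions versus the proportion in case1
summarize prop
display "Proportion in case1: " pop_prop
```

If this is roughly right, the data in memory after simulate should hold one row per sample in the variable prop, so the proportions from all samples are kept and their mean can be reported with summarize.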
Thanks!