Hello StataList community,
I am working with a dataset that contains two groups: "group1" and "group2." My goal is to create a smaller subsample of "group2" observations that have the same mean values of the binary variables "var1" and "var2" as found in "group1."
It's important to note that the mean values of both "var1" and "var2" are higher for "group1" than for "group2" in the full sample. Consequently, a random subsample of "group2" is inappropriate, and I must carefully select "semi-random" observations from group2 to ensure that the means of var1 and var2 in the group2 subsample are equal to (or close to) the variable means of group1.
(As a side note, the size of the group2 subsample should be about 1/100000 of the size of the full "group2" sample.)
The structure of the dataset is as follows:
Code:
clear
input byte(group var1 var2)
1 1 0
1 1 0
1 1 1
1 1 0
1 1 1
1 1 1
1 0 1
1 0 0
1 0 0
1 0 0
2 0 0
2 1 0
2 0 0
2 0 0
2 1 0
2 1 1
2 0 1
2 0 0
2 0 0
2 1 0
end
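I would appreciate any guidance on how to achieve this in Stata. If you could provide me with the necessary code or steps, I would be extremely grateful.
Thank you in advance for your assistance!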
Best regards,
Marvin
PS:
I am aware of the commands
Code:
splitsample
and
Code:
rsz
but neither of them seems to do the trick.
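For reference, one possible way to approach the selection described above is stratified sampling on the four var1 x var2 cells: if the group2 subsample has the same cell shares as group1, the means of var1 and var2 match as well. The following is only a minimal sketch under stated assumptions, not a tested solution. It assumes the example data (variables group, var1, var2) are in memory. The seed, the target size target_n, and the helper variables cell, n1_cell, share1, quota, u, and insample are hypothetical names introduced for illustration. On real data, target_n would be set to roughly 1/100000 of the group2 count.

```stata
* Sketch: draw a group2 subsample whose var1 x var2 cell shares follow group1.
* Assumes the example data (group, var1, var2) are already in memory.
set seed 12345                      // hypothetical seed, for reproducibility only
local target_n = 3                  // hypothetical subsample size for the toy data

* Four cells defined by the two binary variables (0, 1, 2, 3)
generate byte cell = 2*var1 + var2

* Share of each cell within group1
bysort cell: egen n1_cell = total(group == 1)
count if group == 1
generate share1 = n1_cell / r(N)

* Number of group2 observations to keep in each cell (rounded, so approximate)
generate quota = round(share1 * `target_n')

* Random order within each group-by-cell block; keep the first quota rows of group2
generate double u = runiform()
bysort group cell (u): generate byte insample = group == 2 & _n <= quota

* Compare the group1 means with the means of the selected group2 subsample
tabstat var1 var2 if group == 1 | insample, by(group) statistics(mean)
```

Because the quotas are rounded, the subsample means are only close to the group1 means when target_n is small, as in the toy data. A cell is also under-filled whenever group2 has fewer observations in it than the quota asks for.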