Hi all,
First, I'm sorry if I have failed to find an existing thread discussing this issue.
I am attempting to run a simulation in which variables are correlated within groups. I have found an example, reposted below, that handles correlated variables, but I am still at a loss as to how to use this approach while grouping observations non-randomly. For example, I want to simulate data on students in different classrooms, where student characteristics are correlated within classrooms.
Thank you.
* Set up the steps you want to repeat for the simulation in a program
program define myprog2
    * drop all variables to create an empty dataset, do not use clear
    drop _all
    * create a vector that contains the equivalent of a lower triangular correlation matrix
    matrix c = (1, 0.5968, 1, 0.6623, 0.6174, 1)
    * create a vector that contains the means of the variables
    matrix m = (52.23,52.775,52.645)
    * create a vector that contains the standard deviations
    matrix sd = (10.25,9.47,9.36)
    * draw a sample of 1000 cases from a normal distribution with specified correlation structure
    * and specified means and standard deviations
    drawnorm x1 x2 y, n(1000) corr(c) cstorage(lower) means(m) sds(sd)
    * run the desired command
    reg y x1 x2
end

* use the simulate command to rerun myprog2 1000 times
* collect the betas (_b) and standard errors (_se) from the regression each time
* You'll probably want to set reps(10) for testing, then set it higher for the simulation.
simulate _b _se, reps(1000): myprog2
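One possible way to add the classroom grouping, offered only as a rough sketch and not a tested solution: draw one shared component per classroom, copy each classroom row with expand, then let drawnorm supply the student-level parts with the same correlation structure as above. The program name myprog3, the 50 classrooms of 20 students, and the standard deviations of 5 (classroom level) and 8 (student level) are all assumptions made for illustration, so the total standard deviations no longer match the original example.

```stata
* Hypothetical sketch: 50 classrooms x 20 students (both numbers are assumptions)
capture program drop myprog3
program define myprog3
    drop _all
    * classroom level: one row per classroom, one shared component per variable
    set obs 50
    generate classroom = _n
    generate u_x1 = rnormal(0, 5)
    generate u_x2 = rnormal(0, 5)
    generate u_y  = rnormal(0, 5)
    * student level: turn each classroom row into 20 student rows
    expand 20
    * student-specific parts, correlated across variables as in the original example
    matrix c = (1, 0.5968, 1, 0.6623, 0.6174, 1)
    matrix m = (52.23,52.775,52.645)
    * assumed student-level standard deviations (not from the original example)
    matrix sd = (8,8,8)
    * with no n() option, drawnorm fills the observations already in memory
    drawnorm e_x1 e_x2 e_y, corr(c) cstorage(lower) means(m) sds(sd)
    * observed value = shared classroom component + student-specific part
    generate x1 = e_x1 + u_x1
    generate x2 = e_x2 + u_x2
    generate y  = e_y  + u_y
    * cluster the standard errors on classroom
    reg y x1 x2, vce(cluster classroom)
end

* small reps() for testing, as in the original example
simulate _b _se, reps(10): myprog3
```

Under these assumed values, the within-classroom correlation of each variable would be 25/(25 + 64), or roughly 0.28. Changing the two standard deviations changes how strongly students in the same classroom resemble each other.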