Hi all,
I am interested in creating a simulated dataset. I would like to generate two correlated binary variables (rho = 0.4), and I thought the -corr2data- command could achieve this. However, it does not seem to work: the generated variables are not correlated. Any advice on what is going wrong?
Code:
clear
set seed 6710789
local n = 1000
local intercept6 = 0.20
local intercept12 = 0.35
local or_sex = 1.05
local or_score = 1.10
local b_sex = log(`or_sex')
local b_score = log(`or_score')

matrix input C = (1 0.20 -0.10 \ 0.20 1 -0.10 \ -0.10 -0.10 1)
matlist C
drawnorm age lat sexr, double corr(C) n(`n')
corr age lat sexr

generate byte score = 0
forvalues cut = 1/10 {
    quietly replace score = score + 1 if invnormal(`cut' / 11) < lat
}
g sex = runiform() < logistic(sexr)

* Adults aged 18 to 65
quietly replace age = floor((65 - 18 + 1) * normal(age) + 18)

*** Generate binary variables
corr2data r1 r2, corr(1 0.4 \ 0.4 1) means(`intercept6', `intercept12')

* Binary variables
sum r1
local r6mean = r(mean)
sum r2
local r12mean = r(mean)

g temp = (logit(`r12mean') + ///
    sex* `b_sex' + ///
    score*`b_score')
generate recovery12= runiform() < logistic(temp)
capture drop temp

g temp = (logit(`r6mean') + ///
    sex* `b_sex' + ///
    score*`b_score')
generate recovery6= runiform() < logistic(temp)
drop temp

tab1 recovery6 recovery12
polychoric recovery6 recovery12
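For comparison, one possible approach is to threshold correlated latent normal draws directly, instead of passing only the means of r1 and r2 into two separate runiform() calls. This is only a minimal sketch under my own assumptions: rho = 0.4 is read as the latent (tetrachoric) correlation, the margins 0.20 and 0.35 are reused from the locals above, the covariates sex and score are left out, and the names R, z6 and z12 are made up for illustration.

```stata
clear
set seed 6710789

* Assumed latent correlation matrix (rho = 0.4 on the latent normal scale)
matrix R = (1, 0.4 \ 0.4, 1)

* Draw two correlated standard normal variables (hypothetical names z6, z12)
drawnorm z6 z12, n(1000) corr(R)

* Threshold each latent variable so that P(recovery6 = 1) is about 0.20
* and P(recovery12 = 1) is about 0.35
generate byte recovery6  = z6  < invnormal(0.20)
generate byte recovery12 = z12 < invnormal(0.35)

* Inspect the joint distribution and the tetrachoric correlation
tabulate recovery6 recovery12
tetrachoric recovery6 recovery12
```

If this is right, the tetrachoric estimate should land near 0.4, while the plain Pearson correlation from -corr- would be smaller, because dichotomizing the latent variables attenuates it.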