Dear all,
Although I am well aware of Stata's powerful drawnorm command for generating bivariate normal data, I tried using a property of the bivariate normal distribution concerning the conditional distribution of X2 given X1 = x1.
If X1 and X2 have a bivariate normal distribution with means m1 and m2, variances s1^2 and s2^2, and correlation r, then the conditional distribution of X2 given X1 = x1 is itself normal with mean m2 + r(s2/s1)(x1 - m1) and variance (1 - r^2)s2^2 (see e.g. Bickel and Doksum, 1977, page 26). When the marginal distributions are standard normal (i.e. m1 = m2 = 0 and s1 = s2 = 1), this implies that the conditional distribution of X2 given X1 = x1 is normal with mean r*x1 and variance 1 - r^2.
However, when using the following code:
Code:
clear
set seed 123
set obs 1000000
local rho = 0.7
* draw X1 from a standard normal
gen double x1 = rnormal(0,1)
* draw X2 given X1: intended mean rho*x1 and variance 1-rho^2
gen double x2 = rnormal(`rho' * x1, 1-`rho'^2)
summ x*
pwcorr x1 x2
I end up with X2 having a standard deviation of 0.87 (instead of 1) and X1 and X2 having a correlation of 0.81 (instead of 0.7).
Am I missing something? Can anyone explain what is wrong here?
Kind regards, Adriaan Hoogendoorn
REFERENCE: Bickel, Peter J & Doksum, Kjell A (1977) Mathematical Statistics: Basic Ideas and Selected Topics, Holden-Day Inc., Oakland, California.
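A possible explanation, offered as a sketch rather than a definitive answer: Stata's rnormal(m, s) takes a standard deviation s as its second argument, not a variance. Assuming that is the only issue, the code in the question draws X2 with conditional standard deviation 1 - r^2 = 0.51 instead of sqrt(1 - r^2), which is about 0.714. Writing Z for an independent standard normal (a symbol introduced here purely for illustration), the simulated variable is X2 = r*X1 + (1 - r^2)*Z, and with r = 0.7 the reported values follow:

```latex
\begin{align*}
\operatorname{Var}(X_2) &= r^2 + (1 - r^2)^2 = 0.49 + 0.2601 = 0.7501, \\
\operatorname{sd}(X_2) &= \sqrt{0.7501} \approx 0.866, \\
\operatorname{Corr}(X_1, X_2) &= \frac{\operatorname{Cov}(X_1, X_2)}{\operatorname{sd}(X_1)\,\operatorname{sd}(X_2)}
  = \frac{0.7}{1 \times 0.866} \approx 0.808.
\end{align*}
```

These agree with the observed 0.87 and 0.81. If this reading is right, passing sqrt(1-`rho'^2) as the second argument of rnormal() should give a standard deviation near 1 and a correlation near 0.7.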