drawnorm

Samira Barzin

Join Date: Apr 2016

Posts: 21
#1

12 May 2016, 07:00

Dear all,
I am trying to generate three correlated variables from three different normal distributions, which seems straightforward to do with drawnorm.
My problem is the following: let's call the variables x, y, and z. I want x and z to be positively correlated, and y and z to be positively correlated, but I want no correlation between x and y. If I set the corresponding entry of the correlation matrix to 0, I get an error (c not positive (semi)definite). What solution might there be to this problem?

Scott Merryman

Join Date: Mar 2014
Posts: 895

12 May 2016, 07:23

Without seeing your code, it is impossible to say why you received an error; however, this example works:

Code:

. matrix input c = (1,0,.5\0,1,.5\.5,.5,1)

. drawnorm x y z, n(10000) corr(c) clear
(obs 10,000)

. corr
(obs=10,000)

             |        x        y        z
-------------+---------------------------
           x |   1.0000
           y |  -0.0099   1.0000
           z |   0.4886   0.4933   1.0000


Samira Barzin

Join Date: Apr 2016
Posts: 21

12 May 2016, 10:05

Dear Scott,
Thanks for your input. I have pasted my code below; it is still not working.

Code:

matrix m=(3.910824443128,3.21568602139,0)
matrix sd=(0.04896043912,0.0798724418,1)
matrix c=(1, 0,0.8\0, 1,0.65\0.8,0.65,1)
drawnorm lnK lnL A, n(400) corr(c) means(m) sds(sd)


Carole J. Wilson

Join Date: Jan 2015

Posts: 932
#4

12 May 2016, 10:40

Depending on what you are doing with this, you can force the creation of the data with the forcepsd option, but there can be consequences for your analysis. Read: Not Positive Definite Matrices--Causes and Cures
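
A minimal sketch of what that could look like with the matrices from post #3. The seed value and the clear option are additions for illustration and are not part of the original code. As I read the drawnorm documentation, forcepsd replaces the supplied matrix with a positive semidefinite approximation, so the correlations in the generated data will not exactly match the requested 0, 0.8, and 0.65.

```stata
* Matrices as posted in #3
matrix m  = (3.910824443128, 3.21568602139, 0)
matrix sd = (0.04896043912, 0.0798724418, 1)
matrix c  = (1, 0, 0.8 \ 0, 1, 0.65 \ 0.8, 0.65, 1)

* Seed chosen arbitrarily, only so the draw is reproducible
set seed 12345

* forcepsd lets drawnorm proceed despite the negative eigenvalue of c
drawnorm lnK lnL A, n(400) corr(c) means(m) sds(sd) forcepsd clear

* Compare the realized correlations with the requested ones
correlate lnK lnL A
```

Inspecting the output of correlate shows how far the forced matrix has moved away from the one that was asked for.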

Stata/MP 14.1 (64-bit x86-64)
Revision 19 May 2016
Win 8.1

Scott Merryman

Join Date: Mar 2014
Posts: 895

12 May 2016, 12:38

This is adapted from Rick Wicklin's blog post: http://blogs.sas.com/content/iml/201...on-matrix.html

See also: http://eprints.ma.man.ac.uk/232/01/c..._ep2006_70.pdf

Code:

//Project correlation matrix to the nearest positive semidefinite correlation matrix
//Adapted from http://blogs.sas.com/content/iml/2012/11/28/computing-the-nearest-correlation-matrix.html
//matrix c = (1, .99, .35 \ .99, 1, .8 \.35, .8, 1)


matrix m=(3.910824443128,3.21568602139,0)
matrix sd=(0.04896043912,0.0798724418,1)
matrix c=(1, 0,0.8\0, 1,0.65\0.8,0.65,1)

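mata
W = .                           // eigenvectors, filled by symeigensystem()
L = .                           // eigenvalues (row vector), filled by symeigensystem()
maxd = 1
c = st_matrix("c")
Xold = c
Yold = c
ds = J(rows(c), cols(c), 0)     // correction term carried between iterations

while (maxd > 1e-09) {
    R = Yold - ds
    // Project R onto the positive semidefinite matrices:
    // set negative eigenvalues to zero and rebuild
    symeigensystem(R, W, L)
    L2 = L
    for (k=1; k<=cols(L); k++) {
        L2[1,k] = max((L2[1,k], 0))
    }
    X = W * diag(L2) * W'
    ds = X - R
    // Project X onto the matrices with unit diagonal
    Y = X
    for (k=1; k<=cols(c); k++) {
        Y[k,k] = 1
    }

    // Largest relative elementwise change; row sums could let
    // differences of opposite sign cancel
    dx = max(abs(X - Xold))/max(abs(X))
    dy = max(abs(Y - Yold))/max(abs(Y))
    dxy = max(abs(Y - X))/max(abs(Y))
    maxd = max((dx, dy, dxy))
    Xold = X
    Yold = Y
}
st_matrix("c_update", Y)
end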
mat list c_update

drawnorm x y z, n(10000) corr(c_update) means(m) sds(sd) clear
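
To see what the projection did, a quick check along the following lines may help. It assumes c and c_update still exist from the code above; the names evecs and evals are placeholders chosen here.

```stata
* Eigenvalues of the repaired matrix: the smallest should be zero
* up to rounding, rather than clearly negative
matrix symeigen evecs evals = c_update
matrix list evals

* How far the repaired matrix is from the one originally requested
display "relative difference: " mreldif(c_update, c)

* Correlations actually realized in the simulated data
correlate x y z
```

The x-y correlation in the simulated data will no longer be centered on zero, which is the price of making the matrix admissible.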


William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

12 May 2016, 21:06

Going back to post #1, your example is a very specific correlation matrix, rather than the more abstract wording you used in posing the question.

But you are not free to choose the elements of a correlation matrix arbitrarily, and Stata has told you that the matrix you chose is not positive semidefinite, meaning it is not a legitimate correlation matrix, any more than if one of the correlations were outside the range [-1,+1].

The failure to be positive semidefinite can be confirmed by calculating the eigenvalues of c as below, and noticing that the third one is negative. Carole and Scott have pointed you in the direction of finding a positive semidefinite matrix close to the one you specified.

I wanted to make the inability to arbitrarily array correlations in a symmetric matrix explicit in this thread, to benefit others who may land upon it.

Code:

. matrix list c

symmetric c[3,3]
     c1   c2   c3
r1    1
r2    0    1
r3   .8  .65    1

. matrix symeigen X v = c

. matrix list v

v[1,3]
            e1          e2          e3
r1   2.0307764           1  -.03077641
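
For the special case asked about in post #1, where the x-y correlation is zero, the determinant of the 3x3 correlation matrix reduces to 1 - r(x,z)^2 - r(y,z)^2, so the matrix can only be positive semidefinite if the two squared correlations with z sum to at most 1. A short sketch of that check follows; the matrix names c_ok and c_bad are made up here for illustration.

```stata
* Matrix from post #2 (works) and from post #3 (fails)
matrix c_ok  = (1, 0, .5 \ 0, 1, .5 \ .5, .5, 1)
matrix c_bad = (1, 0, .8 \ 0, 1, .65 \ .8, .65, 1)

* A negative determinant rules out positive semidefiniteness
display "det(c_ok)  = " det(c_ok)
display "det(c_bad) = " det(c_bad)

* With r(x,y) = 0 the condition is r(x,z)^2 + r(y,z)^2 <= 1
display "c_ok:  " (.5^2+.5^2)
display "c_bad: " (.8^2+.65^2)
```

With correlations of .5 and .5 the sum of squares is .5, while .8 and .65 give 1.0625, which is why the second matrix is rejected.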

Last edited by William Lisowski; 12 May 2016, 21:10.

Dirk Enzmann

Join Date: Apr 2014
Posts: 541

13 May 2016, 07:30

As William Lisowski says in #6, you are not free to choose the elements of a correlation matrix arbitrarily. The admissible range of correlations between two variables i and j is co-determined by the correlations of all other variables with i and j.

You can check the admissible range by calculating the limiting values of r(vi,vj) using:

limit1 = cos(acos(r(vi,vx)) + acos(r(vj,vx)))
limit2 = cos(acos(r(vi,vx)) - acos(r(vj,vx)))

where r(a,b) denotes the correlation between a and b, vi stands for variable i, vj stands for variable j, and vx stands for any arbitrary third variable in the same correlation matrix.

Example:

Code:

. * (1) Create an "admissible" (positive definite) correlation matrix:
. matrix input C = (1.00,  .62, -.05\ ///
>                    .62, 1.00,  .75\ ///
>                   -.05,  .75, 1.00)

. matrix rownames C = v1 v2 v3

. matrix colnames C = v1 v2 v3

. matlist C

             |        v1         v2         v3
-------------+---------------------------------
          v1 |         1                      
          v2 |       .62          1            
          v3 |      -.05        .75          1

. * All eigenvalues are positive:
. matrix symeigen X v = C

. matlist v

             |        e1         e2         e3
-------------+---------------------------------
          r1 |  1.948892   1.049103   .0020053

.
. /* (2) Find the limits for r(v1,v2) of a positive definite correlation matrix
>        given r(v1,v3) and r(v2,v3): */
. di "the range of r(v1,v2) is limited between " ///
>    cos(acos(C[1,3])+acos(C[2,3])) " and " ///
>    cos(acos(C[1,3])-acos(C[2,3]))
the range of r(v1,v2) is limited between -.69811051 and .62311051

.
. /* (3) set r(v1,v2) outside the limits to create a "not admissible"
>        (non positive definite) correlation matrix: */
. matrix C[1,2] = .65

. matrix C[2,1] = .65

. matlist C

             |        v1         v2         v3
-------------+---------------------------------
          v1 |         1                      
          v2 |       .65          1            
          v3 |      -.05        .75          1

. * There are negative eigenvalues:
. matrix symeigen X v = C

. matlist v

             |        e1         e2         e3
-------------+---------------------------------
          r1 |  1.968061    1.04949  -.0175506

. * Consequently, -factormat- using C will fail:
. factormat C, n(100)
C not positive (semi)definite
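
The same limits can be worked out for the matrix in post #3, as a rough sketch. The scalar names r_KA and r_LA are made up here, and the /// line continuations assume the code is run from a do-file.

```stata
* Correlations requested in post #3: r(lnK,A) = .8 and r(lnL,A) = .65
scalar r_KA = .8
scalar r_LA = .65

* Admissible range for r(lnK,lnL) given the two correlations with A
display "r(lnK,lnL) must lie between " ///
    cos(acos(r_KA)+acos(r_LA)) " and " ///
    cos(acos(r_KA)-acos(r_LA))
```

The limits come out at roughly 0.064 and 0.976, so a correlation of exactly zero between lnK and lnL falls just outside the admissible range.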

For a more detailed discussion in the context of structural equation modeling (SEM) see:

Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K.A. Bollen & J.S. Long (Eds.), Testing Structural Equation Models (pp. 256-293). Newbury Park, CA: Sage.

Last edited by Dirk Enzmann; 13 May 2016, 07:48.
