  • drawnorm

    Dear all,
    I am trying to generate three correlated variables from three different normal distributions, which seems straightforward with drawnorm.
    My problem is the following: call the variables x, y, and z. I want x and z to be positively correlated, and y and z to be positively correlated, but x and y to be uncorrelated. If I set the corresponding entry of the correlation matrix to 0, I get an error (c not positive (semi)definite). What solution is there to this problem?

  • #2
    Without seeing your code, it is impossible to say why you received an error; however, this example works:

    Code:
    . matrix input c = (1,0,.5\0,1,.5\.5,.5,1)
    
    . drawnorm x y z, n(10000) corr(c) clear
    (obs 10,000)
    
    . corr
    (obs=10,000)
    
                 |        x        y        z
    -------------+---------------------------
               x |   1.0000
               y |  -0.0099   1.0000
               z |   0.4886   0.4933   1.0000


    • #3
      Dear Scott,
      Thanks for your input. My code is below; it is still not working.

      Code:
      matrix m=(3.910824443128,3.21568602139,0)
      matrix sd=(0.04896043912,0.0798724418,1)
      matrix c=(1, 0,0.8\0, 1,0.65\0.8,0.65,1)
      drawnorm lnK lnL A, n(400) corr(c) means(m) sds(sd)


      • #4
        Depending on what you are doing with this, you can force the creation of the data with the forcepsd option, but there can be consequences for your analysis. Read: Not Positive Definite Matrices--Causes and Cures
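
        As a minimal sketch of what that looks like, here is the code from #3 with forcepsd added. The seed value is arbitrary and only there for reproducibility, and clear is added so the example runs from a fresh session. Because the matrix is modified before the draw, expect the realized correlations to differ from the ones requested; in particular, lnK and lnL will generally not come out uncorrelated.

        ```stata
        * Sketch only: inputs copied from #3; seed chosen arbitrarily
        matrix m  = (3.910824443128, 3.21568602139, 0)
        matrix sd = (0.04896043912, 0.0798724418, 1)
        matrix c  = (1, 0, 0.8 \ 0, 1, 0.65 \ 0.8, 0.65, 1)

        set seed 12345
        * forcepsd lets drawnorm proceed by adjusting c to be positive semidefinite
        drawnorm lnK lnL A, n(400) corr(c) means(m) sds(sd) forcepsd clear

        * Compare the sample correlations with the ones requested in c
        correlate lnK lnL A
        ```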
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1


        • #5
          This is adapted from Rick Wicklin's http://blogs.sas.com/content/iml/201...on-matrix.html

          See also: http://eprints.ma.man.ac.uk/232/01/c..._ep2006_70.pdf

          Code:
          //Project correlation matrix to the nearest positive semidefinite correlation matrix
          //Adapted from http://blogs.sas.com/content/iml/2012/11/28/computing-the-nearest-correlation-matrix.html
          //matrix c = (1, .99, .35 \ .99, 1, .8 \.35, .8, 1)
          
          
          matrix m=(3.910824443128,3.21568602139,0)
          matrix sd=(0.04896043912,0.0798724418,1)
          matrix c=(1, 0,0.8\0, 1,0.65\0.8,0.65,1)
          
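          mata
          W = .                           // eigenvectors (filled in by symeigensystem)
          L = .                           // eigenvalues, returned as a row vector
          maxd = 1                        // convergence measure; starts above the tolerance
          c = st_matrix("c")
          Xold = c
          Yold = c
          ds = J(rows(c), cols(c), 0)     // correction term carried between iterations
          
          while (maxd > 1e-09) {
              R = Yold - ds
              // Project R onto the positive semidefinite matrices:
              // clip negative eigenvalues at 0 and rebuild the matrix
              symeigensystem(R, W, L)
              L2 = L
              for (k=1; k<=cols(L); k++) {
                  L2[1,k] = max((L2[1,k], 0))
              }
              X = W * diag(L2) * W'
              ds = X - R
              // Project X onto the matrices with unit diagonal
              Y = X
              for (k=1; k<=cols(c); k++) {
                  Y[k,k] = 1
              }
          
              // Relative changes between iterations; stop once all are small
              dx = max(abs(rowsum(X - Xold))) / max(abs(rowsum(X)))
              dy = max(abs(rowsum(Y - Yold))) / max(abs(rowsum(Y)))
              dxy = max(abs(rowsum(Y - X))) / max(abs(rowsum(Y)))
              maxdxy = dx, dy, dxy
              maxd = max(maxdxy)
              Xold = X
              Yold = Y
          }
          // Return the adjusted matrix to Stata as c_update
          st_matrix("c_update", Y)
          end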
          mat list c_update
          
          drawnorm x y z, n(10000) corr(c_update) means(m) sds(sd) clear
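
          A quick way to check the result, assuming c_update from the code above is still in memory, is to list its eigenvalues in the same way as for any other symmetric matrix. The smallest one should be zero up to rounding rather than clearly negative. The names X and v are arbitrary.

          ```stata
          * Sketch: inspect the eigenvalues of the adjusted matrix
          * (requires c_update created by the Mata code above)
          matrix symeigen X v = c_update
          matrix list v
          ```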


          • #6
            Going back to post #1, your example is a very specific correlation matrix, rather than the more abstract wording you used in posing the question.

            But you are not free to choose the elements of a correlation matrix arbitrarily, and Stata has told you that the matrix you chose is not positive semidefinite, meaning it is not a legitimate correlation matrix, any more than if one of the correlations were outside the range [-1,+1].

            The failure to be positive semidefinite can be confirmed by calculating the eigenvalues of c as below, and noticing that the third one is negative. Carole and Scott have pointed you in the direction of finding a positive semidefinite matrix close to the one you specified.

            I wanted to make the inability to arbitrarily array correlations in a symmetric matrix explicit in this thread, to benefit others who may land upon it.
            Code:
            . matrix list c
            
            symmetric c[3,3]
                 c1   c2   c3
            r1    1
            r2    0    1
            r3   .8  .65    1
            
            . matrix symeigen X v = c
            
            . matrix list v
            
            v[1,3]
                        e1          e2          e3
            r1   2.0307764           1  -.03077641
            
            .
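
            For this particular pattern, with a zero in the (1,2) position, the eigenvalues can also be worked out by hand. A short sketch, writing a and b for the two nonzero correlations (notation introduced here for illustration only):

            \[
            C = \begin{pmatrix} 1 & 0 & a \\ 0 & 1 & b \\ a & b & 1 \end{pmatrix},
            \qquad
            \det(C - \lambda I) = (1-\lambda)\left[(1-\lambda)^2 - a^2 - b^2\right],
            \]
            \[
            \lambda \in \left\{\, 1 + \sqrt{a^2 + b^2},\;\; 1,\;\; 1 - \sqrt{a^2 + b^2} \,\right\}.
            \]

            So the matrix is positive semidefinite only if a^2 + b^2 <= 1. With a = .8 and b = .65, a^2 + b^2 = 1.0625, whose square root is about 1.0308, which reproduces the eigenvalues 2.0308, 1, and -.0308 listed above. In the example from #2, a = b = .5 gives a^2 + b^2 = .5, so that matrix is fine.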
            Last edited by William Lisowski; 12 May 2016, 21:10.


            • #7
              As William Lisowski says in #6, you are not free to choose the elements of a correlation matrix arbitrarily. The admissible range of correlations between two variables i and j is co-determined by the correlations of all other variables with i and j.

              You can check the admissible range by calculating the limiting values of r(vi,vj) using:
              • limit1 = cos(acos(r(vi,vx)) + acos(r(vj,vx)))
              • limit2 = cos(acos(r(vi,vx)) - acos(r(vj,vx)))
              where vi stands for variable i and vj stands for variable j and vx stands for any arbitrary third variable in the same correlation matrix.
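
              The same limits can be written without trigonometric functions. As a sketch, using cos(α ± β) = cos α cos β ∓ sin α sin β and sin(acos r) = sqrt(1 - r^2), and writing r_{ij} for r(vi,vj) (subscript notation assumed here for brevity):

              \[
              r_{ix}\, r_{jx} - \sqrt{(1 - r_{ix}^2)(1 - r_{jx}^2)}
              \;\le\; r_{ij} \;\le\;
              r_{ix}\, r_{jx} + \sqrt{(1 - r_{ix}^2)(1 - r_{jx}^2)}
              \]

              For three variables this is the same as requiring the determinant of the correlation matrix, 1 + 2 r_{ij} r_{ix} r_{jx} - r_{ij}^2 - r_{ix}^2 - r_{jx}^2, to be nonnegative: the two limits are the roots of that expression viewed as a quadratic in r_{ij}.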

              Example:
              Code:
              . * (1) Create an "admissible" (positive definite) correlation matrix:
              . matrix input C = (1.00,  .62, -.05\ ///
              >                    .62, 1.00,  .75\ ///
              >                   -.05,  .75, 1.00)
              
              . matrix rownames C = v1 v2 v3
              
              . matrix colnames C = v1 v2 v3
              
              . matlist C
              
                           |        v1         v2         v3
              -------------+---------------------------------
                        v1 |         1                      
                        v2 |       .62          1            
                        v3 |      -.05        .75          1
              
              . * All eigenvalues are positive:
              . matrix symeigen X v = C
              
              . matlist v
              
                           |        e1         e2         e3
              -------------+---------------------------------
                        r1 |  1.948892   1.049103   .0020053
              
              .
              . /* (2) Find the limits for r(v1,v2) of a positive definite correlation matrix
              >        given r(v1,v3) and r(v2,v3): */
              . di "the range of r(v1,v2) is limited between " ///
              >    cos(acos(C[1,3])+acos(C[2,3])) " and " ///
              >    cos(acos(C[1,3])-acos(C[2,3]))
              the range of r(v1,v2) is limited between -.69811051 and .62311051
              
              .
              . /* (3) set r(v1,v2) outside the limits to create a "not admissible"
              >        (non positive definite) correlation matrix: */
              . matrix C[1,2] = .65
              
              . matrix C[2,1] = .65
              
              . matlist C
              
                           |        v1         v2         v3
              -------------+---------------------------------
                        v1 |         1                      
                        v2 |       .65          1            
                        v3 |      -.05        .75          1
              
              . * There are negative eigenvalues:
              . matrix symeigen X v = C
              
              . matlist v
              
                           |        e1         e2         e3
              -------------+---------------------------------
                        r1 |  1.968061    1.04949  -.0175506
              
              . * Consequently, -factormat- using C will fail:
              . factormat C, n(100)
              C not positive (semi)definite
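
              The same limit formulas can be applied to the matrix from #3. This is only a sketch; the matrix is copied from that post, and x, y, z are the variable names from #1. With r(x,z) = .8 and r(y,z) = .65, the two limits come out at roughly 0.064 and 0.976, so the requested r(x,y) = 0 lies just below the admissible range.

              ```stata
              * Sketch: admissible range for r(x,y) given r(x,z) = .8 and r(y,z) = .65
              matrix c = (1, 0, 0.8 \ 0, 1, 0.65 \ 0.8, 0.65, 1)
              display "lower limit for r(x,y): " cos(acos(c[1,3]) + acos(c[2,3]))
              display "upper limit for r(x,y): " cos(acos(c[1,3]) - acos(c[2,3]))
              ```
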
              For a more detailed discussion in the context of structural equation modeling (SEM) see:

              Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K.A. Bollen & J.S. Long (Eds.), Testing Structural Equation Models (pp. 256-293). Newbury Park, CA: Sage.
              Last edited by Dirk Enzmann; 13 May 2016, 07:48.
