
  • r(506) not positive (semi)definite error with drawnorm

    Hi everyone.

    I am running a Monte Carlo analysis on a meta-dataset to calculate standard errors from simulated coefficients. My drawnorm code works for all (~600) observations in my dataset except two. For these two observations, I get error r(506): matrix not positive (semi)definite. I can't quite figure out what the problem is. I have almost no experience with matrix math, so I'm hoping someone here can help me troubleshoot.

    Code:
    drawnorm "`coef_list'", means(m) cov(`cov_mat') cstorage(full)
    For one of the troublesome observations, `coef_list' holds coef_1 coef_2 coef_3 coef_4 coef_5, which are all the variables I'm trying to fill in with this command.
    The means matrix `m':
    Code:
    m[1,5]
               wq_1        wq_2          wq_3          wq_4        wq_5
    r1        -.172        -.018         .022         .002         .071
    The covariance matrix `cov_mat':
    Code:
                 wq_1      wq_2          wq_3        wq_4          wq_v5
    wq_1      .001867
    wq_2       -.0004      .000589
    wq_3      -.00024     .0000674     .0000399
    wq_4     -.000086   -9.900e-07    9.880e-07     .0000146
    wq_5     -.000063     -.000014     -.000012     .0000175      .000218
    I'm hoping someone has an idea of where the problem is. I didn't generate the covariance matrix myself; it was provided by the author of the original regression whose coefficients I am simulating. I don't know whether something is wrong with my code or the matrix, or whether Stata is just reading something wrong. I appreciate any insights anyone can provide. Thanks.
    Last edited by Oliver Lane; 10 Jan 2018, 10:02.

  • #2
    Well, Stata is correct. That matrix is not positive semi-definite.
    Code:
    . * Example generated by -dataex-. To install: ssc install dataex
    . clear
    
    . input str4 var1 float(wq_1 wq_2 wq_3 wq_4 wq_v5)
    
              var1       wq_1       wq_2       wq_3       wq_4      wq_v5
      1. "wq_1"  .001867         .        .        .       .
      2. "wq_2"   -.0004   .000589        .        .       .
      3. "wq_3"  -.00024  .0000674 .0000399        .       .
      4. "wq_4" -.000086 -9.90e-07 9.88e-07 .0000146       .
      5. "wq_5" -.000063  -.000014 -.000012 .0000175 .000218
      6. end
    
    . 
    . mvencode wq_*, mv(0)
            wq_2: 1 missing value recoded
            wq_3: 2 missing values recoded
            wq_4: 3 missing values recoded
           wq_v5: 4 missing values recoded
    
    . 
    . mkmat wq_*, matrix(M) rownames(var1)
    
    . 
    . matrix M = (M+M')
    
    . 
    . forvalues i = 1/5 {
      2.         matrix M[`i', `i'] = 0.5*M[`i', `i']
      3. }
    
    . 
    . matrix list M
    
    symmetric M[5,5]
                 wq_1        wq_2        wq_3        wq_4        wq_5
     wq_1     .001867
     wq_2      -.0004     .000589
     wq_3     -.00024    .0000674    .0000399
     wq_4    -.000086  -9.900e-07   9.880e-07    .0000146
    wq_v5    -.000063    -.000014    -.000012    .0000175     .000218
    
    . 
    . matrix symeigen eigenvectors eigenvalues = M
    
    . matrix list eigenvalues
    
    eigenvalues[1,5]
                e1          e2          e3          e4          e5
    r1   .00201841   .00047923   .00021525   .00001598  -3.787e-07
    Its fifth eigenvalue is negative. The covariance matrix of a multivariate normal distribution must be positive semi-definite, which means that none of its eigenvalues can be negative.
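
    If you need to run the same check on other covariance matrices, it can be scripted. The following is only a minimal sketch: it types in the matrix from #1 as a full symmetric matrix, and the names X, v and min_ev are arbitrary choices of mine. It relies on -matrix symeigen- returning the eigenvalues sorted from largest to smallest. Run it from a do-file, because the /// line continuations do not work when typed interactively.

    ```stata
    * Full symmetric version of the covariance matrix posted in #1
    matrix M = ( .001867 , -.0004   , -.00024  , -.000086 , -.000063  \ ///
                -.0004   ,  .000589 ,  .0000674, -9.90e-07, -.000014  \ ///
                -.00024  ,  .0000674,  .0000399,  9.88e-07, -.000012  \ ///
                -.000086 , -9.90e-07,  9.88e-07,  .0000146,  .0000175 \ ///
                -.000063 , -.000014 , -.000012 ,  .0000175,  .000218  )

    * symeigen stores the eigenvalues in a row vector, largest first,
    * so the last element is the smallest eigenvalue
    matrix symeigen X v = M
    scalar min_ev = v[1, colsof(v)]

    * A negative smallest eigenvalue means the matrix is not positive semi-definite
    if min_ev < 0 {
        display as error "not positive semi-definite; smallest eigenvalue = " min_ev
    }
    else {
        display as text "positive semi-definite; smallest eigenvalue = " min_ev
    }
    ```

    For this matrix the check should take the first branch, in line with the negative fifth eigenvalue in the log above.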

    -drawnorm- has a -forcepsd- option, which replaces any negative eigenvalue with 0 and then reconstructs a covariance matrix from the result. Since this negative eigenvalue is very close to zero, and only about 1/40th as large in magnitude as the next smallest eigenvalue, I think that you will not go wrong by using this option in this case.
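
    To make the idea concrete, here is a rough sketch of that kind of repair done by hand, followed by the -drawnorm- call that does it for you. This illustrates the description above and is not a claim about how -drawnorm- is implemented internally. The matrix names (X, v, M_psd, D), n(1000) and the seed value are my own assumptions; the means and covariances are copied from #1. Run it from a do-file, since it uses /// line continuations.

    ```stata
    * Covariance matrix from #1, entered as a full symmetric matrix
    matrix M = ( .001867 , -.0004   , -.00024  , -.000086 , -.000063  \ ///
                -.0004   ,  .000589 ,  .0000674, -9.90e-07, -.000014  \ ///
                -.00024  ,  .0000674,  .0000399,  9.88e-07, -.000012  \ ///
                -.000086 , -9.90e-07,  9.88e-07,  .0000146,  .0000175 \ ///
                -.000063 , -.000014 , -.000012 ,  .0000175,  .000218  )

    * Step 1: eigendecomposition, so that M = X*diag(v)*X'
    matrix symeigen X v = M

    * Step 2: set every negative eigenvalue to 0
    forvalues j = 1/`=colsof(v)' {
        if v[1, `j'] < 0 {
            matrix v[1, `j'] = 0
        }
    }

    * Step 3: rebuild the matrix and remove any floating-point asymmetry
    matrix M_psd = X * diag(v) * X'
    matrix M_psd = (M_psd + M_psd') / 2

    * How far did the repair move the matrix?
    matrix D = M_psd - M
    matrix list D

    * In practice, let drawnorm handle it with forcepsd
    * (n() and seed() values here are arbitrary)
    clear
    matrix m = (-.172, -.018, .022, .002, .071)
    drawnorm coef_1 coef_2 coef_3 coef_4 coef_5, n(1000) means(m) cov(M) forcepsd seed(12345)
    summarize coef_*
    ```

    Looking at D is a quick way to judge whether the adjustment is negligible relative to the variances and covariances you care about.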

    That said, you might contact the author of the original regression to find out how that matrix was created. The covariance matrix of coefficients from a regression is necessarily positive semi-definite. Perhaps it is a matter of rounding error, since it is so very close to that here.


    • #3
      Thank you, that is extremely helpful. I didn't realize there was a forcepsd option. I just reran things with that option and everything is running smoothly again. I will check the eigenvalues for the other observation that gives me this error and see whether the negative value is similarly small. If so, I think I will be fine assuming that it is rounding error.

      Thanks again Clyde - I've been banging my head up against a wall on this for the better part of a day.
