Why does mvnormal() occasionally produce missing values?

Chi-lin Tsai

Join Date: Mar 2017

Posts: 17
#1

Why does mvnormal() occasionally produce missing values?

26 Apr 2024, 10:51

Dear all,

I am using Stata SE 17.0, update level 13 Feb 2024. I found that the mvnormal-family commands, e.g., mvnormal(), mvnormalqp(), occasionally produce missing values. Here is an example:

Code:

m: U =0, 5, 5, -10 //upper limits m: R =1, 0, 0, 0, 1, .5, .5, 1, .5, 1 //vectorized correlation matrices m: mvnormal(U, R) //returns a missing value . m: mvnormalqp(U, R, 5000) //returns a missing value, even if the maximum number of quadrature points is specified .

It seems that this issue occurs not because the upper limits are too extreme. As can be seen from the examples below, both MoreExtremeU and LessExtremeU work just fine. mvnormalqp() returns non-missing results.

Code:

m: R =1, 0, 0, 0, 1, .5, .5, 1, .5, 1 m: MoreExtremeU =0, 5, 5, -10-1 m: mvnormalqp(MoreExtremeU, R, 5000) 6.89514e-60 m: LessExtremeU =0, 5, 5, -10+1 m: mvnormalqp(LessExtremeU, R, 5000) 3.20780e-20

So far I have only found this issue when the theoretical result is close to 0, but I am not sure if the same issue would occur when the result is close to 1, so I am still hesitating over whether I should replace all missing values with zero or not.

I use mvnormalqp() to write my maximum likelihood estimation model (a lf-type evaluator programmed in Mata and used in Stata command "ml model"). I also use mvnormalqp() to write my post-estimation predict programme, which is then automatically used in the official command margins. Therefore, my questions are:

1) Is there any way to avoid mvnormalqp() generating missing values?

2) Does the missing-value issue of mvnormalqp() matter to the estimation of ml model and margins? I know that ml model with a lf-type evaluator has some protection against missing values in the log-likelihood function, but does margins do similar protection too?

Thank you.

Chi-lin

Last edited by Chi-lin Tsai; 26 Apr 2024, 10:57. Reason: Add tags
Tags: margins, ml model, mvnormal, predicted probability

Chi-lin Tsai

Join Date: Mar 2017
Posts: 17

05 May 2024, 04:18

After some experimentations, I found two additional points:

First, the missing-value issue is even more severe, when the lower limits are explicitly specified. When mindouble() is specified as the lower limit (i.e., the smallest value that can be stored in storage type double), mvnormalqp() does return a non-missing value, but it is negative, which is nonsense, as the probability can't be negative:

Code:

m: U =0, 5, 5, -10 //upper limits
m: R =1, 0, 0, 0,  1, .5, .5,  1, .5,  1 //vectorized correlation matrices

m: L =J(1,cols(U),mindouble()) //the lower limits
m: M =J(1,cols(U),0) //means

m: mvnormalqp(     U,    R, 5000), ///
   mvnormalqp(  L, U,    R, 5000), /// //specify lower limits
   mvnormalcvqp(L, U, M, R, 5000)      //specify means as well

                1              2              3
  +----------------------------------------------+
1 |             .   -1.28075e-25   -1.28075e-25  |
  +----------------------------------------------+

Second, by comparing the results produced by mvnormalqp() and ghk(), it is almost sure that the missing-value issue occurs, only when the probability is close to 0.

Code:

mata:
rseed(123)
N =100
U =runiform(N, 4, -20, 20) //randomly generate upper limits
L =J(1,cols(U),mindouble()) //lower limits
M =J(1,cols(U),0) //means
i =1
while (i>0) { //randomly generate a positive definite correlation matrix
    R =1, runiform(1, 3, -.5, .5),
       1, runiform(1, 2, -.5, .5),
       1, runiform(1, 1, -.5, .5), 1
    eigensystem(invvech(R'), Evec=., Eval=.)
    i =Eval[1]<0 + Eval[2]<0 + Eval[3]<0 + Eval[4]<0
}
S =ghkfast_init(N, 5000, 4, "hammersley")
P =mvnormalqp(     U,    R, 5000),
   mvnormalqp(  L, U,    R, 5000), //specify lower limits
   mvnormalcvqp(L, U, M, R, 5000), //specify means as well
   ghkfast(S, U, invvech(R'))      //ghk algorithm
end

m: colmissing(P) //display num of missing-value cases
                 1              2              3              4
   +-------------------------------------------------------------+
 1 |            31              0              0              0  |
   +-------------------------------------------------------------+

m: P[selectindex(P[,1]:==.),] //display missing-value cases

                 1              2              3              4
   +-------------------------------------------------------------+
 1 |             .   -9.04304e-13   -9.04304e-13    1.28863e-12  |
 2 |             .   -4.73616e-61   -4.73616e-61    2.64472e-87  |
 3 |             .   -1.02131e-10   -1.02131e-10    9.42588e-53  |
 4 |             .   -1.18889e-58   -1.18889e-58    6.12344e-89  |
 5 |             .   -2.39505e-36   -2.39505e-36    2.28163e-38  |
 6 |             .   -3.32988e-18   -3.32988e-18    1.87886e-64  |
 7 |             .   -1.30611e-21   -1.30611e-21    2.48153e-38  |
 8 |             .   -3.65511e-40   -3.65511e-40    4.26120e-76  |
 9 |             .   -7.31193e-25   -7.31193e-25    2.77654e-33  |
10 |             .   -1.69346e-17   -1.69346e-17    4.25672e-78  |
11 |             .   -9.05412e-27   -9.05412e-27    1.14299e-38  |
12 |             .   -9.32544e-11   -9.32544e-11    2.54511e-22  |
13 |             .   -1.08026e-15   -1.08026e-15    1.97692e-34  |
14 |             .   -4.91895e-45   -4.91895e-45    4.16277e-57  |
15 |             .   -1.17356e-11   -1.17356e-11    1.42053e-50  |
16 |             .   -2.35085e-12   -2.35085e-12    5.89600e-55  |
17 |             .   -2.14321e-16   -2.14321e-16    2.86704e-39  |
18 |             .   -6.93790e-38   -6.93790e-38    1.49411e-58  |
19 |             .   -6.0776e-180   -6.0776e-180    5.9466e-130  |
20 |             .   -1.10515e-12   -1.10515e-12    8.93542e-60  |
21 |             .   -1.80679e-41   -1.80679e-41    4.6324e-105  |
22 |             .   -1.36669e-43   -1.36669e-43    6.75590e-59  |
23 |             .   -4.13118e-12   -4.13118e-12    9.77657e-12  |
24 |             .   -1.25710e-24   -1.25710e-24    6.06826e-29  |
25 |             .   -6.76242e-38   -6.76242e-38    8.55283e-52  |
26 |             .   -7.39241e-57   -7.39241e-57    2.31159e-57  |
27 |             .   -2.60058e-15   -2.60058e-15    1.80489e-45  |
28 |             .   -9.36494e-99   -9.36494e-99    6.5647e-106  |
29 |             .   -8.80583e-61   -8.80583e-61    1.16315e-65  |
30 |             .   -7.18445e-22   -7.18445e-22    5.72767e-31  |
31 |             .   -1.66582e-83   -1.66582e-83    2.50751e-76  |
   +-------------------------------------------------------------+

Desipte these findings, I still have not found a solution to the missing-value issue of mvnormal-family commands. Instead, these findings raise two new questions:

1) What algorithm does mvnormal use to calculate the probability? I thought it was using the ghk algorithm, but apparently it is not (because they produce different results.)

2) Which one -- mvnormal or ghk -- is better in terms of accuracy and speed?

Comments and remarks are welcome. Thank you very much again.

Last edited by Chi-lin Tsai; 05 May 2024, 04:25.

Comment

John Mullahy

Join Date: Dec 2016

Posts: 751
#3

07 May 2024, 07:58

Chi-lin: I was able to replicate the first missing result in #1 using v18.0 on a Macbook running Sonoma 14.4.1.

Since no one has yet offered an explanation or solution on Statalist I would suggest contacting Stata Technical Support.
Comment

Chi-lin Tsai

Join Date: Mar 2017
Posts: 17

10 May 2024, 02:09

Dear Prof. Mullahy,

Thank you very much for your reply. I'll contact Stata Technical Support and keep everyone here updated of any new information from Stata's team (though I'm not sure whether I'm entitled to get official support, since I'm using Stata 17 rather than the current release of Stata 18. https://www.stata.com/support/tech-support/contact/) Anyway, let me add a few more points from my own experimentations for future reference.

First, the code below duplicates from post #2, but this time, I sort the 100 simulations according to the ghk results. Sorry for such a long table below, but it clearly shows that the missing/negative-value issue doesn't only occur in the most extreme cases (i.e. not only in the cases that are most close to 0); instead, the issue scatters over different levels of cases. For example, mvnormal returns missing/negative values for case 14, while ghk returns 5.9473e-130, which is indeed a rather extreme case; however, mvnormal also returns missing/negative values for case 83, while ghk returns 9.77650e-12, which is not that extreme by comparison.

Second, mvnormal also returns many zero probabilities, and again, this occurs not only in the most extreme cases but scatters around. For example, mvnormal returns zeros for case 75, but it returns proper values (i.e. non-zero, non-negative and non-missing values) for cases before (e.g. 74, 73, 72) and after (e.g. 76, 77, 78). A zero probability is theoretically fine but practically problematic as well as missing or negative probabilities for maximum likelihood estimation, because a zero probability means a missing log-likelihood, and that can cause the MLE algorithm to abort the current set of parameters and try another set that may not be really better than the current ones (because the zero-value issue does not necessarily occur in the most extreme cases but scatters around), and also which can waste computational time for iteration back and forth.

Code:

mata:
rseed(123)
N =100
U =runiform(N, 4, -20, 20) //randomly generate upper limits
L =J(1,cols(U),mindouble()) //lower limits
M =J(1,cols(U),0) //means
i =1
while (i>0) { //randomly generate a positive definite correlation matrix
    R =1, runiform(1, 3, -.5, .5),
       1, runiform(1, 2, -.5, .5),
       1, runiform(1, 1, -.5, .5), 1
    eigensystem(invvech(R'), Evec=., Eval=.)
    i =Eval[1]<0 + Eval[2]<0 + Eval[3]<0 + Eval[4]<0
}
S =ghkfast_init(N, 5000, 4, "hammersley")
P =mvnormalqp(     U,    R, 5000), //                       col 1 in the table below
   mvnormalqp(  L, U,    R, 5000), //specify lower limits;  col 2 in the table below
   mvnormalcvqp(L, U, M, R, 5000), //specify means as well; col 3 in the table below
   ghkfast(S, U, invvech(R'))      //ghk algorithm;         col 4 in the table below
end

m: sort(P,4) //sort by the ghk results (col 4)
                    1              2              3              4
      +-------------------------------------------------------------+
    1 |             0              0              0    2.7113e-222  |
    2 |             0              0              0    1.5683e-217  |
    3 |             0              0              0    5.9378e-183  |
    4 |             0              0              0    1.5414e-175  |
    5 |             0              0              0    6.1667e-168  |
    6 |             0              0              0    7.3629e-155  |
    7 |             0              0              0    4.6778e-141  |
    8 |             0              0              0    1.0507e-138  |
    9 |             0              0              0    4.0807e-138  |
   10 |             0              0              0    1.2131e-136  |
   11 |             0              0              0    1.4760e-131  |
   12 |             0              0              0    3.2130e-131  |
   13 |             0              0              0    4.5001e-131  |
   14 |             .   -6.0776e-180   -6.0776e-180    5.9473e-130  |
   15 |             0              0              0    4.1564e-116  |
   16 |             0              0              0    3.6298e-108  |
   17 |   3.76870e-95    3.76870e-95    3.76870e-95    7.6390e-108  |
   18 |             0              0              0    5.2900e-106  |
   19 |             .   -9.36494e-99   -9.36494e-99    6.5598e-106  |
   20 |             .   -1.80679e-41   -1.80679e-41    4.6323e-105  |
   21 |             0              0              0    8.9492e-103  |
   22 |             .   -1.18885e-58   -1.18885e-58    6.10336e-89  |
   23 |   3.56406e-10    3.56406e-10    3.56406e-10    6.21319e-89  |
   24 |             .   -4.73615e-61   -4.73615e-61    2.64472e-87  |
   25 |             0              0              0    4.03935e-84  |
   26 |   3.83308e-38    3.83308e-38    3.83308e-38    7.37247e-82  |
   27 |             .   -1.69346e-17   -1.69346e-17    4.25670e-78  |
   28 |             .   -1.66582e-83   -1.66582e-83    2.50771e-76  |
   29 |             0              0              0    3.81206e-76  |
   30 |             .   -3.65511e-40   -3.65511e-40    4.26119e-76  |
   31 |             0              0              0    1.56154e-73  |
   32 |             0              0              0    1.32681e-72  |
   33 |             0              0              0    1.30398e-69  |
   34 |             0              0              0    1.64851e-68  |
   35 |   3.05286e-10    3.05286e-10    3.05286e-10    1.23207e-67  |
   36 |             .   -8.80583e-61   -8.80583e-61    1.16314e-65  |
   37 |             .   -3.32988e-18   -3.32988e-18    1.87886e-64  |
   38 |             0              0              0    2.47912e-64  |
   39 |             0              0              0    4.56999e-63  |
   40 |             0              0              0    9.60716e-62  |
   41 |             .   -1.10491e-12   -1.10491e-12    8.93532e-60  |
   42 |             .   -1.44604e-43   -1.44604e-43    6.76591e-59  |
   43 |             .   -6.93790e-38   -6.93790e-38    1.49424e-58  |
   44 |   5.56589e-57    5.56589e-57    5.56589e-57    2.35608e-58  |
   45 |             .   -7.39241e-57   -7.39241e-57    2.31183e-57  |
   46 |             .   -4.91850e-45   -4.91850e-45    4.16310e-57  |
   47 |   1.03174e-52    1.03174e-52    1.03174e-52    7.65363e-57  |
   48 |             .   -2.36992e-12   -2.36992e-12    5.89594e-55  |
   49 |   1.20154e-10    1.20154e-10    1.20154e-10    4.73866e-54  |
   50 |             .   -1.02864e-10   -1.02864e-10    9.42588e-53  |
   51 |             .   -6.76242e-38   -6.76242e-38    8.55336e-52  |
   52 |   1.76705e-28    1.76705e-28    1.76705e-28    2.66006e-51  |
   53 |             .   -1.17356e-11   -1.17356e-11    1.42053e-50  |
   54 |             0              0              0    5.25266e-49  |
   55 |             .   -2.60284e-15   -2.60284e-15    1.80540e-45  |
   56 |   2.84288e-11    2.84288e-11    2.84288e-11    3.93690e-45  |
   57 |   1.71241e-36    1.71241e-36    1.71241e-36    4.63579e-42  |
   58 |             .   -2.14321e-16   -2.14321e-16    2.86704e-39  |
   59 |             .   -9.05480e-27   -9.05480e-27    1.14291e-38  |
   60 |             .   -2.39988e-36   -2.39988e-36    2.27873e-38  |
   61 |             .   -1.30611e-21   -1.30611e-21    2.48176e-38  |
   62 |             .   -1.08027e-15   -1.08027e-15    1.97691e-34  |
   63 |   9.13092e-31    9.13092e-31    9.13092e-31    5.39092e-34  |
   64 |             0              0              0    7.92395e-34  |
   65 |   6.77774e-11    6.77774e-11    6.77774e-11    1.40353e-33  |
   66 |             .   -7.31242e-25   -7.31242e-25    2.77664e-33  |
   67 |             .   -7.17486e-22   -7.17486e-22    5.83723e-31  |
   68 |   6.01498e-11    6.01498e-11    6.01498e-11    1.03228e-29  |
   69 |             .   -1.25710e-24   -1.25710e-24    6.06249e-29  |
   70 |             .   -9.31139e-11   -9.31139e-11    2.54512e-22  |
   71 |   1.06404e-13    1.06404e-13    1.06404e-13    6.89166e-22  |
   72 |   6.25664e-12    6.25664e-12    6.25664e-12    1.14354e-20  |
   73 |   1.26153e-18    1.26153e-18    1.26153e-18    1.94266e-18  |
   74 |   7.65077e-12    7.65077e-12    7.65077e-12    7.11284e-18  |
   75 |             0              0              0    8.85506e-17  |
   76 |   3.08008e-15    3.08008e-15    3.08008e-15    3.69193e-15  |
   77 |   4.47304e-15    4.47304e-15    4.47304e-15    5.09514e-15  |
   78 |   3.88722e-14    3.88722e-14    3.88722e-14    3.94943e-14  |
   79 |   1.28221e-13    1.28221e-13    1.28221e-13    1.28257e-13  |
   80 |   5.10429e-13    5.10429e-13    5.10429e-13    5.11682e-13  |
   81 |   8.41259e-13    8.41259e-13    8.41259e-13    8.65835e-13  |
   82 |             .   -9.96536e-13   -9.96536e-13    1.28863e-12  |
   83 |             .   -4.13709e-12   -4.13709e-12    9.77650e-12  |
   84 |   5.64202e-11    5.64202e-11    5.64202e-11    5.64233e-11  |
   85 |   5.50928e-08    5.50928e-08    5.50928e-08    5.49572e-08  |
   86 |   7.96023e-08    7.96023e-08    7.96023e-08    7.96026e-08  |
   87 |   1.48729e-06    1.48729e-06    1.48729e-06    1.48730e-06  |
   88 |   .0002695765    .0002695765    .0002695765    .0002695764  |
   89 |   .0006192855    .0006192855    .0006192855    .0006192853  |
   90 |   .0053727413    .0053727413    .0053727413    .0053727283  |
   91 |   .0107465779    .0107465779    .0107465779    .0107465779  |
   92 |   .0167161486    .0167161486    .0167161486    .0167161486  |
   93 |   .1135244903    .1135244903    .1135244903    .1135244903  |
   94 |   .7253684312    .7253684312    .7253684312    .7253684312  |
   95 |   .7906354932    .7906354932    .7906354932    .7906354202  |
   96 |   .9888297905    .9888297905    .9888297905    .9888297904  |
   97 |   .9999999999    .9999999999    .9999999999              1  |
   98 |             1              1              1              1  |
   99 |             1              1              1              1  |
  100 |             1              1              1              1  |
      +-------------------------------------------------------------+

Third, given all the issues above (and also given that Stata's cmmprobit uses ghk instead of mvnormal), ghk seems to be a better choice than mvnormal. The problem is that ghk is slower than mvnormal under a similar precision level. The table below shows the computational time, the number of iterations, and the log-likelihood of a MLE model of mine (which a bit complicated, so I'm not going to detail it here.) As can be seen from the table, mvnormal with the default 128 quardrature points is the fastest. As for ghk, on the one hand, it becomes slower and slower as integration points increase (by the way, ghkfast doesn't speed up the computation of my model to a practical level.) On the other hand, the more integration points, the closer between the the log-likelihood of ghk and that of mvnormal. Neither of these is surprising. What I wanna say here is that, if I opt for ghk in order avoid the missing/negative/zero probabilities, then I have to specify a very large number of integration points and wait several hours for the convergence of one single model. However, if I stick to mvnormal, I would have to face the uncertainty about the potential optimisation problems by the missing/negative/zero probabilities.

HTML Code:

+------------------------------------------------------------------------------------------+
|Time(sec)|Iter|      LL    |                                                              |
+---------+----+------------+--------------------------------------------------------------+
|  408.76 |  5 | -11414.555 |mvnormal: 128 quadrature points                               |
| 1616.76 |  4 | -11414.117 |ghk:     1000 Hammersley points; no pivot; no antithetic draw |
| 8731.66 |  4 | -11414.616 |ghk:     5000 Hammersley points; no pivot; no antithetic draw |
|17898.83 |  4 | -11414.581 |ghk:    10000 Hammersley points; no pivot; no antithetic draw |
+------------------------------------------------------------------------------------------+

Sorry for the long post. I guess that many of the issues and questions can only be addressed, if we know the underlying algorithm of mvnormal, so I'll contact Stata Technical Support. Meanwhile, remarks and advice are still very welcome here. Thank you very much.

Last edited by Chi-lin Tsai; 10 May 2024, 02:13.

Announcement