How to compute the nearest positive definite matrix in Stata

Tiago Pereira

Join Date: Jan 2016
Posts: 405

How to compute the nearest positive definite matrix in Stata

19 Jun 2023, 20:04

Dear Statalisters,

Here is my problem:

[CODE]

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input double(y1 y2 y3 V11 V22 V33 V12 V13 V23)
.026383349697260604  .12976520542009573  .24293224436300737 .0008748842993677882 .0005412437445877245 .0020661462350592126 .00030152317645533867 .00020717540851181314 .0002147076486284315
 .16672760556617316 -.03071545716172275 -.15278090694110302 .0026170399755727314 .0017630378967933304  .004162395980374229  .0007982445099144358  .0006579982281102412 .0006783226262312874
 .18870618718253707  .07827002832983285  -.5083775696283495  .009003455824558866  .006674393764124837  .015503895865096649  .0028066743454998695  .0021764046251487243 .0026094379132329732
-.24861908809077787   .1958184287474064 -.06210259287116862  .024355964092499592  .007780704202271271   .01752038893093712  .0032648982435753175  .0032889470632104746 .0034581980175482888
end

Code:

meta mvregress y1 y2 y3, wcovvariables(V11 V12 V22 V33 V13 V23)

covariance matrix not positive definite
Within-study covariance matrix in study 1 defined by variables V11 V12 V22 V33 V13
V23 not positive definite

Are you aware of any implementation of algorithms to compute the nearest positive definite matrix (to an approximate one) in Stata? Any tips/suggestions are very much appreciated.

All the best,

Tiago

Tags: None

Joseph Coveney

Join Date: Apr 2014

Posts: 4449
#2

19 Jun 2023, 22:24

You could try something like the following. A bit cheating, but it runs.

.ÿ
.ÿversionÿ18.0

.ÿ
.ÿclearÿ*

.ÿ
.ÿquietlyÿinputÿdouble(y1ÿy2ÿy3ÿV11ÿV22ÿV33ÿV12ÿV13ÿV23)

.ÿ
.ÿ*
.ÿ*ÿBeginÿhere
.ÿ*
.ÿforvaluesÿiÿ=ÿ1/6ÿ{
ÿÿ2.ÿÿÿÿÿÿÿÿÿquietlyÿgenerateÿdoubleÿC`i'ÿ=ÿ.
ÿÿ3.ÿ}

.ÿ
.ÿmkmatÿV??,ÿmatrix(Cov)

.ÿ
.ÿforvaluesÿrowÿ=ÿ1/4ÿ{
ÿÿ2.ÿÿÿÿÿÿÿÿÿmatrixÿdefineÿCov`row'ÿ=ÿCov[`row',ÿ1..6]
ÿÿ3.ÿÿÿÿÿÿÿÿÿquietlyÿfactormatÿCov`row',ÿn(100)ÿshape(upper)ÿnames(aÿbÿc)ÿforcepsd
ÿÿ4.ÿÿÿÿÿÿÿÿÿmatrixÿdefineÿDÿ=ÿdiag(e(sds))
ÿÿ5.ÿÿÿÿÿÿÿÿÿmatrixÿdefineÿCov`row'ÿ=ÿDÿ*ÿe(C)ÿ*ÿD
ÿÿ6.ÿÿÿÿÿÿÿÿÿquietlyÿ{
ÿÿ7.ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿreplaceÿC1ÿ=ÿCov`row'[1,ÿ1]ÿinÿ`row'
ÿÿ8.ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿreplaceÿC2ÿ=ÿCov`row'[1,ÿ2]ÿinÿ`row'
ÿÿ9.ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿreplaceÿC3ÿ=ÿCov`row'[1,ÿ3]ÿinÿ`row'
ÿ10.ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿreplaceÿC4ÿ=ÿCov`row'[2,ÿ2]ÿinÿ`row'
ÿ11.ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿreplaceÿC5ÿ=ÿCov`row'[2,ÿ3]ÿinÿ`row'
ÿ12.ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿreplaceÿC6ÿ=ÿCov`row'[3,ÿ3]ÿinÿ`row'
ÿ13.ÿÿÿÿÿÿÿÿÿ}
ÿ14.ÿ}

.ÿ
.ÿmetaÿmvregressÿy1ÿy2ÿy3,ÿwcovvariables(C1ÿC2ÿC3ÿC4ÿC5ÿC6)ÿnolog

Multivariateÿrandom-effectsÿmeta-analysisÿÿÿÿÿÿÿNumberÿofÿobsÿÿÿÿÿ=ÿÿÿÿÿÿÿÿÿ12
Method:ÿREMLÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿNumberÿofÿstudiesÿ=ÿÿÿÿÿÿÿÿÿÿ4
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿObsÿperÿstudy:
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿminÿ=ÿÿÿÿÿÿÿÿÿÿ3
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿavgÿ=ÿÿÿÿÿÿÿÿ3.0
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿmaxÿ=ÿÿÿÿÿÿÿÿÿÿ3
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿWaldÿchi2(0)ÿÿÿÿÿÿ=ÿÿÿÿÿÿÿÿÿÿ.
Logÿrestricted-likelihoodÿ=ÿ2.3608801ÿÿÿÿÿÿÿÿÿÿÿProbÿ>ÿchi2ÿÿÿÿÿÿÿ=ÿÿÿÿÿÿÿÿÿÿ.

------------------------------------------------------------------------------
ÿÿÿÿÿÿÿÿÿÿÿÿÿ|ÿCoefficientÿÿStd.ÿerr.ÿÿÿÿÿÿzÿÿÿÿP>|z|ÿÿÿÿÿ[95%ÿconf.ÿinterval]
-------------+----------------------------------------------------------------
y1ÿÿÿÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿ_consÿ|ÿÿÿ.0607938ÿÿÿ.0716534ÿÿÿÿÿ0.85ÿÿÿ0.396ÿÿÿÿ-.0796442ÿÿÿÿ.2012319
-------------+----------------------------------------------------------------
y2ÿÿÿÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿ_consÿ|ÿÿÿ.0971689ÿÿÿ.0567609ÿÿÿÿÿ1.71ÿÿÿ0.087ÿÿÿÿ-.0140804ÿÿÿÿ.2084182
-------------+----------------------------------------------------------------
y3ÿÿÿÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿ_consÿ|ÿÿ-.1066196ÿÿÿ.1673961ÿÿÿÿ-0.64ÿÿÿ0.524ÿÿÿÿ-.4347099ÿÿÿÿ.2214707
------------------------------------------------------------------------------
Testÿofÿhomogeneity:ÿQ_Mÿ=ÿchi2(9)ÿ=ÿ128.11ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿProbÿ>ÿQ_Mÿ=ÿ0.0000

-----------------------------------------
ÿÿÿRandom-effectsÿparametersÿ|ÿÿÿEstimate
-----------------------------+-----------
Unstructured:ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ|
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿsd(y1)ÿ|ÿÿÿÿ.128248
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿsd(y2)ÿ|ÿÿÿ.1091635
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿsd(y3)ÿ|ÿÿÿ.3313586
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿcorr(y1,y2)ÿ|ÿÿ-.9956929
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿcorr(y1,y3)ÿ|ÿÿ-.5586137
ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿcorr(y2,y3)ÿ|ÿÿÿ.4793216
-----------------------------------------

.ÿ
.ÿexit

endÿofÿdo-file

.

I've attached the do-file if you want to check the arithmetic.
Attached Files

forcepsd.do (1.5 KB, 1 view)
Comment
Tiago Pereira

Join Date: Jan 2016

Posts: 405
#3

20 Jun 2023, 09:07

As usual, thank you so much, Joseph, for your insightful suggestions. I will check this approach, but somehow feel it will change the covariance structure substantially, no?
All the best,
Tiago
Comment
Joseph Coveney

Join Date: Apr 2014

Posts: 4449
#4

20 Jun 2023, 21:25

Originally posted by Tiago Pereira View Post

I will check this approach, but somehow feel it will change the covariance structure substantially, no?

Well, yeah, to the extent that the correlation matrixes aren't positive semidefinite, I would expect that they would be changed.

A couple of years ago, in the context of updating the user-written ovbd (an on-again, off-again project), I wrote a Mata implementation of Higham's algorithm for forming the nearest positive-definite approximation to a square matrix. I think that it was just a port of an R implementation, which itself was a port of a Matlab implementation (or vice versa), which in turn was just a rote implementation of the algorithm.

I tried that on your dataset, but the resulting fixed-up covariance matrixes are a little too edgy, I guess.

Here's the output:

.ÿ
.ÿversionÿ18.0

.ÿ
.ÿclearÿ*

.ÿ
.ÿquietlyÿinputÿdouble(y1ÿy2ÿy3ÿV11ÿV22ÿV33ÿV12ÿV13ÿV23)

.ÿ
.ÿ*
.ÿ*ÿBeginÿhere
.ÿ*
.ÿrunÿnearestPSD1.do

.ÿ
.ÿlocalÿline_sizeÿ`c(linesize)'

.ÿsetÿlinesizeÿ80

.ÿ
.ÿmata:
-------------------------------------------------ÿmataÿ(typeÿendÿtoÿexit)ÿ------
:ÿ
:ÿ//ÿCreateÿnewÿvariablesÿtoÿholdÿnewÿcovarianceÿmatrixÿelements
:ÿIndexesÿ=ÿst_addvar("double",ÿ"c"ÿ:+ÿstrofreal(1..6))

:ÿ
:ÿ//ÿCreateÿaÿviewÿonÿcurrentÿcovarianceÿmatrixÿelements
:ÿst_view(Data=(.),ÿ.,ÿ"V11ÿV22ÿV33ÿV12ÿV13ÿV23")

:ÿ
:ÿ//ÿHigham'sÿalgorithm,ÿrow-by-row
:ÿforÿ(row=1;ÿrow<=rows(Data);ÿrow++)ÿ{
>ÿ
>ÿÿÿÿÿÿÿÿÿst_store(row,ÿIndexes,ÿvech(nearestPSD(invvech(Data[row,ÿ.]')))')
>ÿ}

:ÿ
:ÿend
--------------------------------------------------------------------------------

.ÿ
.ÿsetÿlinesizeÿ`line_size'

.ÿ
.ÿmetaÿmvregressÿy1ÿy2ÿy3,ÿwcovvariables(c1ÿc2ÿc3ÿc4ÿc5ÿc6)ÿnolog
note:ÿ_newX2ÿomittedÿbecauseÿofÿcollinearity.
note:ÿ_newX3ÿomittedÿbecauseÿofÿcollinearity.
_newY collinear with _newX1
r(459);

endÿofÿdo-file

r(459);

.

Again, I've attached the do-file for the Mata implementation of Higham's algorithm (nearestPSD1.do; it's a couple of years old and I haven't looked at it since and can't remember how much I tested it) and the do-file for the output above (Using the Higham method.do), in case you want to check my code (it's a bit dense, sorry) or to pursue this line of attack further.
Attached Files

nearestPSD1.do (950 Bytes, 1 view)

Using the Higham method.do (1.4 KB, 1 view)
1 like
Comment
Tiago Pereira

Join Date: Jan 2016

Posts: 405
#5

23 Jun 2023, 09:20

Thank you so much, Joseph. This is a valuable tip!
I will try it in other similar cases we have.

Thank you so much again!

Tiago
Comment

Announcement

How to compute the nearest positive definite matrix in Stata

Comment

Comment

Comment

Comment