Is there a memory-efficient way to create a massive band diagonal matrix in Stata/Mata, even though Mata doesn't have sparse matrices?

Michael Anbar

Join Date: Aug 2014

Posts: 116
#1

23 Sep 2014, 17:27

I have MATLAB code that takes three long column vectors, d1, d2, and d3, and creates a band diagonal matrix from them. The code is basically this:

Code:

d1 = 1:5;
d2 = 10:10:60;
d3 = 100:100:700;
F = diag(d1, -2) + diag(d2, -1) + diag(d3) + diag(d2, 1) + diag(d1, 2)
y = [17 23 45 42 65 80 95]'; % random data, as a column vector
tau = F \ y

which produces the matrix F

Code:

F =
   100    10     1     0     0     0     0
    10   200    20     2     0     0     0
     1    20   300    30     3     0     0
     0     2    30   400    40     4     0
     0     0     3    40   500    50     5
     0     0     0     4    50   600    60
     0     0     0     0     5    60   700

and the vector tau

Code:

tau =
   0.159379251059404
   0.0928131665695312
   0.133943228364286
   0.0823548041069989
   0.110248261684698
   0.111056176436218
   0.125407697293433

This isn't terribly memory efficient in this form, however, because for extremely long vectors, the matrix F takes up a lot of memory. So, in MATLAB, I used sparse matrices (specifically the -spdiags- function).

Now, I want to do something like this in Stata, but Mata doesn't have ANY sparse matrix capabilities at all, and the documentation recommends against using diag because, as in MATLAB, it's not memory-efficient.

Is there a memory-efficient way to replicate code like this in Mata? I have this code so far:

Code:

mata clear
d1 = range(1, 5, 1);
d2 = range(10, 60, 10);
d3 = range(100, 700, 100);
T = length(d3);
F = diag(d3)
for (i = 2; i <= T; i++) {
    F[i, i - 1] = d2[i - 1]
    F[i - 1, i] = d2[i - 1]
}
for (i = 3; i <= T; i++) {
    F[i, i - 2] = d1[i - 2]
    F[i - 2, i] = d1[i - 2]
}
y = (17 \ 23 \ 45 \ 42 \ 65 \ 80 \ 95);
tau = svsolve(F, y);

but without sparse matrices, I don't think this is memory efficient because it's going to store the entire matrix. When dealing with hundreds of thousands of observations, this Mata algorithm will fail because it can't allocate the memory for the full F matrix. Is there any way to replicate the behavior of sparse matrices in Mata so this code will actually work?
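
To put the memory concern in numbers: with T = 100,000, a full T x T matrix of 8-byte doubles needs about 80 GB, whereas the three band vectors together need roughly 2.4 MB. The sketch below illustrates the band-only idea by computing the product F*x directly from d1, d2, and d3, without ever forming F. The function name bandmult() is hypothetical (it is not a Mata built-in), and it assumes F is symmetric with exactly the band layout used in this post.

```mata
mata:
// bandmult() is a hypothetical helper, not part of Mata.
// It returns F*x for the symmetric band matrix whose main diagonal is d3
// (T elements), first off-diagonals are d2 (T-1 elements), and second
// off-diagonals are d1 (T-2 elements). Only vectors are stored.
real colvector bandmult(real colvector d1, real colvector d2, real colvector d3, real colvector x)
{
    real scalar    n, i
    real colvector r

    n = rows(d3)
    r = d3 :* x                           // main-diagonal contribution
    for (i = 1; i <= n - 1; i++) {        // first off-diagonals
        r[i]     = r[i]     + d2[i] * x[i + 1]
        r[i + 1] = r[i + 1] + d2[i] * x[i]
    }
    for (i = 1; i <= n - 2; i++) {        // second off-diagonals
        r[i]     = r[i]     + d1[i] * x[i + 2]
        r[i + 2] = r[i + 2] + d1[i] * x[i]
    }
    return(r)
}

d1 = range(1, 5, 1)
d2 = range(10, 60, 10)
d3 = range(100, 700, 100)
r  = bandmult(d1, d2, d3, J(7, 1, 1))     // row sums of F
r
end
```

Multiplying by a vector of ones returns the row sums of F, so the first element is 100 + 10 + 1 = 111 and the last is 5 + 60 + 700 = 765, matching the matrix shown above.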
Matthew J. Baker

Join Date: Mar 2014

Posts: 126
#2

24 Sep 2014, 07:44

Michael --

While you are right that there are no sparse matrix capabilities in Stata, some routines use similar ideas, and you might be able to piggyback on these. I know that the "sppack" suite by Drukker, Peng, Prucha, and Raciborski makes use of banded matrices to do spatial analysis with large numbers of observations.

David Drukker has slides that I've looked at talking about banded matrices. Also, the above-mentioned group has a paper that gets into the details that might help.

Best,

Matt Baker
Christophe Kolodziejczyk

Join Date: Mar 2014

Posts: 377
#3

24 Sep 2014, 13:36

You can also have a look at the undocumented functions.
See help mf_spmatbanded. Maybe you can find some functions you can use.
Christophe
Michael Anbar

Join Date: Aug 2014

Posts: 116
#4

24 Sep 2014, 15:35

Originally posted by Christophe Kolodziejczyk:

you can also have a look at the undocumented functions.
see help mf_spmatbanded. Maybe you can find some functions you can use.

I'll have to think about how to use this; I'm not sure this is what I want, because it still requires me to have an N x N matrix (the full matrix), which, due to the large number of observations, can't be fit in memory. Are there any other categories of undocumented functions in Mata?
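
One way to sidestep the N x N matrix altogether is to solve the system directly from the three band vectors. What follows is a minimal sketch, not a tested production routine: pentasolve() is a hypothetical name (not a Mata built-in or an undocumented function), and it assumes F is symmetric and pentadiagonal, and that an L*D*L' factorization without pivoting is numerically safe, as it is for a diagonally dominant matrix like the 7 x 7 example in post #1. Storage is a handful of length-N vectors.

```mata
mata:
// pentasolve() is a hypothetical helper, not part of Mata.
// Solves F*x = y, where F is symmetric pentadiagonal with main diagonal d3,
// first off-diagonals d2, and second off-diagonals d1.
// Method: F = L*D*L' with unit lower-triangular L (two subdiagonals l1, l2).
// Assumption: no pivoting is needed (e.g. F is diagonally dominant).
real colvector pentasolve(real colvector d1, real colvector d2, real colvector d3, real colvector y)
{
    real scalar    n, i
    real colvector D, l1, l2, x

    n  = rows(d3)
    D  = J(n, 1, 0)
    l1 = J(n, 1, 0)                       // l1[i] = L[i+1, i]
    l2 = J(n, 1, 0)                       // l2[i] = L[i+2, i]

    // Factorization, one column at a time
    for (i = 1; i <= n; i++) {
        D[i] = d3[i]
        if (i > 1) D[i] = D[i] - l1[i - 1]^2 * D[i - 1]
        if (i > 2) D[i] = D[i] - l2[i - 2]^2 * D[i - 2]
        if (i < n) {
            l1[i] = d2[i]
            if (i > 1) l1[i] = l1[i] - l2[i - 1] * D[i - 1] * l1[i - 1]
            l1[i] = l1[i] / D[i]
        }
        if (i < n - 1) l2[i] = d1[i] / D[i]
    }

    // Forward substitution: L*z = y
    x = y
    for (i = 2; i <= n; i++) {
        x[i] = x[i] - l1[i - 1] * x[i - 1]
        if (i > 2) x[i] = x[i] - l2[i - 2] * x[i - 2]
    }

    // Diagonal scaling: D*w = z
    x = x :/ D

    // Back substitution: L'*x = w
    for (i = n - 1; i >= 1; i--) {
        x[i] = x[i] - l1[i] * x[i + 1]
        if (i < n - 1) x[i] = x[i] - l2[i] * x[i + 2]
    }
    return(x)
}

d1  = range(1, 5, 1)
d2  = range(10, 60, 10)
d3  = range(100, 700, 100)
y   = (17 \ 23 \ 45 \ 42 \ 65 \ 80 \ 95)
tau = pentasolve(d1, d2, d3, y)
tau
end
```

On the small example, the result can be compared against the tau reported in post #1. If the matrix is not safely factorizable without pivoting, a pivoted band solver would be needed instead, which this sketch does not attempt.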