  • Is there a memory-efficient way to create a massive band diagonal matrix in Stata/Mata, even though Mata doesn't have sparse matrices?

    I have MATLAB code that takes three long column vectors, d1, d2, and d3, and creates a band diagonal matrix from them. The code is basically this:

    Code:
    d1 = 1:5;
    d2 = 10:10:60;
    d3 = 100:100:700;
    F = diag(d1, -2) + diag(d2, -1) + diag(d3) + diag(d2, 1) + diag(d1, 2)
    y = [17 23 45 42 65 80 95]'; % random data, as a column vector
    tau = F \ y
    which produces the matrix F

    Code:
    F =
       100    10     1     0     0     0     0
        10   200    20     2     0     0     0
         1    20   300    30     3     0     0
         0     2    30   400    40     4     0
         0     0     3    40   500    50     5
         0     0     0     4    50   600    60
         0     0     0     0     5    60   700
    and the vector tau

    Code:
    tau =
             0.159379251059404
            0.0928131665695312
             0.133943228364286
            0.0823548041069989
             0.110248261684698
             0.111056176436218
             0.125407697293433

    This form isn't memory efficient, however: F is stored as a dense matrix, so for vectors of length n it holds n^2 elements even though at most 5n - 6 of them are nonzero. So, in MATLAB, I used sparse matrices (specifically the -spdiags- function).

    Now, I want to do something like this in Stata, but Mata doesn't have ANY sparse matrix capabilities at all, and the documentation recommends against using diag because, as in MATLAB, it's not memory-efficient.

    Is there a memory-efficient way to replicate code like this in Mata? I have this code so far:

    Code:
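        mata clear
        // the three distinct diagonals: d3 is the main diagonal,
        // d2 the first off-diagonals, d1 the second off-diagonals
        d1 = range(1, 5, 1);
        d2 = range(10, 60, 10);
        d3 = range(100, 700, 100);

        T = length(d3);
        // dense T x T matrix; this is the allocation that grows as T^2
        F = diag(d3)
        // fill the first off-diagonals symmetrically
        for (i = 2; i <= T; i++) {
            F[i, i - 1] = d2[i - 1]
            F[i - 1, i] = d2[i - 1]
        }
        // fill the second off-diagonals symmetrically
        for (i = 3; i <= T; i++) {
            F[i, i - 2] = d1[i - 2]
            F[i - 2, i] = d1[i - 2]
        }

        // solve F * tau = y
        y = (17 \ 23 \ 45 \ 42 \ 65 \ 80 \ 95);
        tau = svsolve(F, y);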
    but without sparse matrices, I don't think this is memory efficient because it's going to store the entire matrix. When dealing with hundreds of thousands of observations, this Mata algorithm will fail because it can't allocate the memory for the full F matrix. Is there any way to replicate the behavior of sparse matrices in Mata so this code will actually work?

  • #2
    Michael --

    While you are right - there are no sparse matrix capabilities in Stata - some routines use similar ideas and you might be able to piggyback on these. I know that the "sppack" suite by Drukker, Peng, Prucha, and Raciborski makes use of banded matrices to do spatial analysis with large numbers of observations.

    David Drukker has slides that I've looked at talking about banded matrices. Also, the above-mentioned group has a paper that gets into the details that might help.

    Best,

    Matt Baker
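
The banded-storage idea mentioned in this reply can be illustrated in plain Mata. The sketch below is not how sppack does it; `bandmult` is a hypothetical helper name, and it assumes a symmetric pentadiagonal F with at least 3 rows, described only by the three vectors d1, d2, and d3 from the question. It computes F*x without ever allocating F, so memory use stays proportional to n rather than n^2.

```mata
mata:
// Hypothetical helper (not part of Mata or sppack): returns F*x for a
// symmetric pentadiagonal F given only its distinct diagonals.
//   d3 (n x 1)    main diagonal
//   d2 (n-1 x 1)  first off-diagonals
//   d1 (n-2 x 1)  second off-diagonals
// Assumes n >= 3.
real colvector bandmult(real colvector d1, real colvector d2,
                        real colvector d3, real colvector x)
{
    real scalar    n
    real colvector y

    n = rows(d3)
    y = d3 :* x                                          // main diagonal
    y[|2 \ n|]     = y[|2 \ n|]     + d2 :* x[|1 \ n-1|] // first subdiagonal
    y[|1 \ n-1|]   = y[|1 \ n-1|]   + d2 :* x[|2 \ n|]   // first superdiagonal
    y[|3 \ n|]     = y[|3 \ n|]     + d1 :* x[|1 \ n-2|] // second subdiagonal
    y[|1 \ n-2|]   = y[|1 \ n-2|]   + d1 :* x[|3 \ n|]   // second superdiagonal
    return(y)
}

// the 7 x 7 example from the question, multiplied by a vector of ones
d1 = range(1, 5, 1)
d2 = range(10, 60, 10)
d3 = range(100, 700, 100)
rowsums = bandmult(d1, d2, d3, J(7, 1, 1))
rowsums
end
```

Multiplying by a vector of ones returns the row sums of F, which makes the result easy to compare with the dense matrix printed in the question.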


    • #3
      You can also have a look at the undocumented functions.
      See help mf_spmatbanded. Maybe you can find some functions you can use.
      Christophe


      • #4
        Originally posted by Christophe Kolodziejczyk
        You can also have a look at the undocumented functions.
        See help mf_spmatbanded. Maybe you can find some functions you can use.
        I'll have to think about how to use this. I'm not sure it is what I want, because it still requires me to have an N x N matrix (the full matrix), which can't fit in memory given the large number of observations. Are there any other categories of undocumented functions in Mata?
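
One way around the N x N requirement, sketched here under stated assumptions, is to solve the system directly from the three diagonal vectors. The function below swaps svsolve() for a banded LDL' factorization; `pentasolve` is a hypothetical name, not a built-in or undocumented Mata function. It assumes F is symmetric, pentadiagonal, and positive definite (no pivoting is done), with at least 3 rows. It only ever stores a handful of length-n vectors.

```mata
mata:
// Hypothetical helper (not a built-in Mata function): solves F*x = y for a
// symmetric pentadiagonal F given only its distinct diagonals.
//   d3 (n x 1)    main diagonal
//   d2 (n-1 x 1)  first off-diagonals
//   d1 (n-2 x 1)  second off-diagonals
// Assumes F is positive definite (no pivoting) and n >= 3.
real colvector pentasolve(real colvector d1, real colvector d2,
                          real colvector d3, real colvector y)
{
    real scalar    n, i
    real colvector D, l1, l2, x

    n  = rows(d3)
    D  = J(n, 1, 0)     // diagonal of D in F = L*D*L'
    l1 = J(n, 1, 0)     // l1[i] = L[i+1, i]
    l2 = J(n, 1, 0)     // l2[i] = L[i+2, i]

    // factorization: O(n) time and memory
    for (i = 1; i <= n; i++) {
        D[i] = d3[i]
        if (i > 1) D[i] = D[i] - l1[i-1]^2 * D[i-1]
        if (i > 2) D[i] = D[i] - l2[i-2]^2 * D[i-2]
        if (i < n) {
            l1[i] = d2[i]
            if (i > 1) l1[i] = l1[i] - l2[i-1] * D[i-1] * l1[i-1]
            l1[i] = l1[i] / D[i]
        }
        if (i < n - 1) l2[i] = d1[i] / D[i]
    }

    // forward substitution: L*z = y (z overwrites x)
    x = y
    for (i = 2; i <= n; i++) {
        x[i] = x[i] - l1[i-1] * x[i-1]
        if (i > 2) x[i] = x[i] - l2[i-2] * x[i-2]
    }

    // diagonal scaling: D*w = z
    x = x :/ D

    // back substitution: L'*x = w
    for (i = n - 1; i >= 1; i--) {
        x[i] = x[i] - l1[i] * x[i+1]
        if (i < n - 1) x[i] = x[i] - l2[i] * x[i+2]
    }
    return(x)
}

// the 7 x 7 example from the first post
d1 = range(1, 5, 1)
d2 = range(10, 60, 10)
d3 = range(100, 700, 100)
y  = (17 \ 23 \ 45 \ 42 \ 65 \ 80 \ 95)
tau = pentasolve(d1, d2, d3, y)
tau
end
```

On a problem small enough to build the dense F, the result can be checked against svsolve(F, y) from the first post. If F is not positive definite, a factorization without pivoting can break down, so this sketch would need a pivoting banded LU instead.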
