  • #16
    Obviously, a 300k by 300k matrix exceeds memory limits, so the matrix multiplication in #15 isn't feasible. However, we can set this up as a blocked matrix multiplication (thanks for the tip, ChatGPT):
    Code:
    putmata X = (ins*)
    mata:
    N = rows(X)
    n_competitors = J(N, 1, 0)
    blocksize = 1000
    Xt = X'
    for (from=1; from<=N; from=from+blocksize) {
        to = min((from+blocksize-1, N))
        xt = Xt[, from..to]                                // block of columns of X'
        n_competitors[from::to] = (colsum((X*xt):>0):-1)'  // subtract 1 for "self"
    }
    n_competitors = rowmax((n_competitors, J(N,1,0)))      // floor at 0 for all-zero rows
    end
    getmata n_competitors = n_competitors
    Now, the time complexity is still \(O\left(n^2\right)\). But, on my machine, that code handles the example data in #12 instantly, as opposed to the 2 seconds the original code in #12 takes. Expanding from 1,000 to 5,000 observations, the blocked matrix multiplication still runs in less than a second, whereas the slightly revised code in #13 takes almost 10 seconds, and the original code takes more than half a minute. I timed the blocked matrix multiplication somewhere between 1.5 and 4 minutes, depending on the block size, for 50,000 observations (I didn't time any of the other approaches on that dataset). I assume the 300k problem will still take much longer, but at least it might be feasible. Otherwise, someone should look into parallelization.
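    For readers who want to see the blocked scheme outside Stata, here is a plain-Python sketch (the data and function name are made up for illustration; pure-Python loops are of course far slower than Mata's matrix operations):

```python
# Count, for each firm, how many *other* firms share at least one sector,
# processing the pairwise comparisons in column blocks, mirroring the Mata
# code above: each block computes the full counts for a slice of firms.

def blocked_competitors(X, blocksize=2):
    N = len(X)
    n_competitors = [0] * N
    for start in range(0, N, blocksize):
        block = X[start:start + blocksize]
        for j, row_j in enumerate(block):
            # Column start+j of (X * X'): overlap of every firm with firm start+j.
            count = sum(
                1
                for row_i in X
                if any(a and b for a, b in zip(row_i, row_j))
            ) - 1  # subtract "self" once per column
            n_competitors[start + j] = count
    # Floor at 0 in case a firm has no sectors at all (self-overlap is then 0).
    return [max(c, 0) for c in n_competitors]

# Tiny example: 4 firms, 3 sector indicators each.
X = [(1, 0, 0),
     (1, 1, 0),
     (0, 1, 0),
     (0, 0, 1)]
print(blocked_competitors(X))  # [1, 2, 1, 0]; firm 4 shares a sector with no one
```

    The block size only affects memory use and speed, not the result: any block size yields the same counts.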
    Last edited by daniel klein; 17 Jul 2025, 03:31.



    • #17
      (Edit: Whoops -- I posted this before I saw Daniel's posting at #16, which covers similar ground. I *think* my simple approach below works, but perhaps I'm wrong.)

      Daniel's first suggestion at #15, with a small modification, made it possible on my machine to solve the original problem (N = 300k firms and 105 sectors) in about half an hour!

      The modification involves adapting to the fact that (X*X') requires 8 * (N^2) bytes of memory in Mata, clearly impractical for N = 300e3. To deal with that, one can successively compare the whole list of firms to a subset of its peers, and add up the number of competitors along the way. That is, one counts up the competitors the N firms have in (say) the first 1000 others, the second 1000 others, ..., last 1000. This approach is still O(N^2) in time, but it's apparently very efficient in Mata, requiring less than 1.0 sec. on the N = 5,000 example cited above.
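      As a quick sanity check on that memory figure (back-of-the-envelope arithmetic only):

```python
# Mata stores real matrices as 8-byte doubles, so a full N x N
# cross-product matrix for N = 300,000 firms needs:
N = 300_000
bytes_needed = 8 * N ** 2
print(bytes_needed / 1e9, "GB")  # 720.0 GB -- far beyond ordinary RAM
```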

      Code:
      putmata X = (serv*)
      mata:
      interval = 1000
      N = rows(X)
      n_competitors = J(N, 1, 0)
      start = 1
      while (start <= N) {
         stop = start + interval - 1
         if (stop > N) stop = N
         n_competitors = n_competitors + rowsum((X * X[start..stop, .]'):>0):-1
         start = stop + 1
      }
      end
      getmata n_competitors = n_competitors
      Last edited by Mike Lacy; 17 Jul 2025, 09:23.



      • #18
        Mike Lacy's code in #17 is almost there - and it's simpler and slightly faster than my code in #16.

        The one problem with the code in #17 is that it subtracts 1 ("self") for every interval/slice/block, when it should be subtracted only once per row. Here's the revised code:
        Code:
        putmata X = (serv*)
        mata:
        interval = 1000
        N = rows(X)
        n_competitors = J(N, 1, 0)
        start = 1
        while (start <= N) {
           stop = start + interval - 1
           if (stop > N) stop = N
           n_competitors = n_competitors + rowsum((X * X[start..stop, .]'):>0)
           start = stop + 1
        }
        n_competitors = rowmax((n_competitors:-1, J(N,1,0)))  // subtract "self" once; floor at 0
        end
        getmata n_competitors = n_competitors
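        To see the difference concretely, here is a small Python sketch with hypothetical toy data, comparing the per-block subtraction from #17 with the single subtraction above:

```python
# Four firms, three sector indicators each (toy data for illustration).
X = [(1, 1, 0),
     (1, 0, 0),
     (0, 1, 1),
     (0, 0, 1)]

def overlaps(row_a, row_b):
    """True if two firms share at least one sector."""
    return any(a and b for a, b in zip(row_a, row_b))

def count(X, interval, subtract_per_block):
    N = len(X)
    totals = [0] * N
    for start in range(0, N, interval):
        block = X[start:start + interval]
        for i, row_i in enumerate(X):
            c = sum(overlaps(row_i, row_j) for row_j in block)
            if subtract_per_block:
                c -= 1  # the bug: "self" removed once per block
            totals[i] += c
    if not subtract_per_block:
        totals = [t - 1 for t in totals]  # remove "self" exactly once
    return [max(t, 0) for t in totals]

print(count(X, interval=2, subtract_per_block=True))   # [1, 0, 1, 0] -- too low
print(count(X, interval=2, subtract_per_block=False))  # [2, 1, 2, 1] -- correct
```

        With a single block (interval >= N) the two versions agree, which is why the bug only shows up once the data are actually split.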
        Last edited by daniel klein; 17 Jul 2025, 14:09.



        • #19
          (Edit: Whoops -- I posted this before I saw your last posts. We'll now work through the new code and let you know ASAP.)

          Dear all,

          Thank you very much for the extensive and detailed help!
          We ran Daniel's last code on a "mock dataset" we created to mimic the one we will use. It runs in about 2 hours (on a machine with 24 GB of RAM), which makes it feasible in the context where we will work. The solution of using the Mata environment and working with matrices is definitely the most efficient for our purposes (we'll keep it in mind for future work, too!). Thanks a lot for the input from everyone! Mike Lacy Andrew Musau daniel klein Clyde Schechter
          Last edited by Giacomo Buzzao; 18 Jul 2025, 04:06.



          • #20
            Here I am again,

            We tried the new approach you suggested (in #18) on our mock data, and it works wonderfully: 15 minutes on the same machine (24 GB of RAM). We are grateful for your help, thanks (also on behalf of my coauthors)!!!
            Last edited by Giacomo Buzzao; 18 Jul 2025, 06:42.

