  • Computation time for accumulation matrixes

    When running simulations, bootstraps, etc. in Mata, I often find it helpful to work with matrixes that accumulate each replication's results row by row. I've always had an informal sense that it is faster to (a) define the accumulation matrix up front as J(r,c,.), filled with missing values (r is the number of replications), and then populate its rows replication by replication, than to (b) start with a zero-row matrix J(0,c,.) and row-append to it replication by replication.

    I did not appreciate how much faster (a) is than (b) until I ran this little timing comparison: roughly 0.013 seconds versus 59.3 seconds for 100,000 replications. I was stunned.
    Code:
    mata
    
    timer_clear()
    
    rseed(2345)
    
    v=J(1,5,1)
    r=100000
    
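    // (a) preallocate all r rows, then fill them row by row
    timer_on(1)
    ctch1=J(r,5,.)
    for (j=1;j<=r;j++) {
     ctch1[j,.]=v
    }
    timer_off(1)
    
    // (b) start with zero rows and row-append on every replication
    timer_on(2)
    ctch2=J(0,5,.)
    for (j=1;j<=r;j++) {
     ctch2=ctch2\v
    }
    timer_off(2)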
    
    ctch1==ctch2
    
    timer()
    
    end
    Results:
    Code:
    :
    : ctch1==ctch2
      1
    
    :
    : timer()
    
    -----------------------------------------------------------------------------------------------------
    timer report
      1.       .013 /        1 =      .013
      2.       59.3 /        1 =    59.268
    -----------------------------------------------------------------------------------------------------

  • #2
    The time difference is substantial. I think this is something StataCorp may like to respond to. My guess is that the first approach handles the matrix in memory, while the second involves file read and write operations. Just my guess.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Associate Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    www.FinTechProfessor.com
    Check my asdoc program, or even better asdocx, that easily sends Stata output to MS Word


    • #3
      A similar problem arises when you use the simulate command in Stata (not Mata). It appends each new simulation result to a dataset row by row, which slows Stata down considerably when the number of replications becomes large. It helps a bit to split the simulations into smaller chunks and then piece everything together at the end.
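      The same chunking idea can be sketched in Mata for the example in the first post. This is only an illustration, not a description of how simulate works internally: the chunk size k, the index i, and the names chunk and ctch4 are made up for the example, and r is assumed to be an exact multiple of k. Each chunk is preallocated and filled in place, so the row-append happens once per k replications rather than on every replication.

```mata
mata

r = 100000                  // total replications, as in the first post
k = 1000                    // chunk size (arbitrary assumption; r must be a multiple of k)
v = J(1, 5, 1)

ctch4 = J(0, 5, .)          // final result, grown one chunk at a time
chunk = J(k, 5, .)          // preallocated buffer, reused for every chunk

for (j = 1; j <= r; j++) {
    i = mod(j - 1, k) + 1   // row within the current chunk
    chunk[i, .] = v
    if (i == k) {
        ctch4 = ctch4 \ chunk   // one append per k replications
    }
}

rows(ctch4)

end
```

      With k = 1000 there are 100 appends instead of 100,000. If r were not a multiple of k, the leftover rows of the last partial chunk would need to be appended after the loop.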
      https://twitter.com/Kripfganz


      • #4
        Although I cannot speak to the details, building the matrix step by step involves checking and reallocating memory in each iteration. Setting up the matrix before populating it allocates the required memory once. Whether that alone explains the timing differences, I cannot tell.
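        To put a rough number on this: if each row-append builds a fresh matrix holding all rows so far (an assumption about Mata's internals, not something confirmed here), then approach (b) writes about 5*r*(r+1)/2 elements in total, which is roughly 2.5e10 for r = 100000, whereas approach (a) writes only 5*r = 500,000. When r is not known in advance, one common workaround is to grow the matrix geometrically and trim it at the end. The following is a minimal sketch of that idea; the names buf, cap, n and ctch3 and the starting capacity of 16 are arbitrary choices for illustration.

```mata
mata

r = 100000                  // used only to drive the loop; the buffer does not rely on it
v = J(1, 5, 1)

cap = 16                    // current capacity in rows (arbitrary starting value)
buf = J(cap, 5, .)          // preallocated buffer
n   = 0                     // number of rows actually filled

for (j = 1; j <= r; j++) {
    if (n == cap) {
        buf = buf \ J(cap, 5, .)   // double the capacity in one append
        cap = 2 * cap
    }
    n = n + 1
    buf[n, .] = v
}

ctch3 = buf[|1, 1 \ n, 5|]  // trim the unused rows at the end

rows(ctch3), cols(ctch3)

end
```

        Here the buffer is reallocated only about a dozen times for 100,000 rows, and every other replication is a plain in-place row assignment as in approach (a).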
