Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Computation time for accumulation matrixes

    In running simulations/bootstraps/etc. in Mata I often find it helpful to work with matrixes that accumulate each replication's results row-by-row. I've always had an informal sense that it is faster to (a) work with an accumulation matrix defined up front as an empty matrix J(r,c,.) (r is the # of replications) and then populate its rows replication-by-replication than (b) set up a zero-row empty matrix J(0,c,.) and then row-append to it replication-by-replication.

    How much faster is (a) than (b) I did not appreciate until running this little timing comparison. I was stunned.
    Code:
    mata
    
    timer_clear()
    
    rseed(2345)
    
    v=J(1,5,1)
    r=100000
    
    timer_on(1)
    ctch1=J(r,5,.)
    for (j=1;j<=r;j++) {
     ctch1[j,.]=v
    }
    timer_off(1)
    
    timer_on(2)
    ctch2=J(0,5,.)
    for (j=1;j<=r;j++) {
     ctch2=ctch2\v
    }
    timer_off(2)
    
    ctch1==ctch2
    
    timer()
    
    end
    Results:
    Code:
    :
    : ctch1==ctch2
      1
    
    :
    : timer()
    
    -----------------------------------------------------------------------------------------------------
    timer report
      1.       .013 /        1 =      .013
      2.       59.3 /        1 =    59.268
    -----------------------------------------------------------------------------------------------------

  • #2
    The time difference is significant. I think this is something Stata Corp. may like to respond to. My guess is that the first approach involves handling matrices in memory while the second approach involves file read and write operations. Just my guess.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

    Comment


    • #3
      A similar problem arises when you use the simulate command in Stata (not Mata). It appends a data set row by row with each new simulation result, which slows down Stata considerably when the number of replications becomes large. It helps a bit to separate the simulations in smaller chunks and then piece everything together at the end.
      https://twitter.com/Kripfganz

      Comment


      • #4
        Although I cannot speak on the details, building the matrix step by step involves checking and reallocating memory in each iteration. Setting up the matrix before populating it, allocates the required memory once. Whether that alone explains the timing differences, I cannot tell.

        Comment


        • #5
          So John, based on your findings that, setting up the matrix before populating it is much faster, how would I translate that to append several matrices.

          Folows my naive attempt:

          Code:
          mata clear
          
          pointers=J(100,1,NULL)
          
          for (i=1; i<=100; i++) {
          pointers[i,1]=&J(10,10,i)
          }
          
          rows=J(length(pointers),1,0)
          
          for (i=1; i<=length(pointers); i++) {
          rows[i,1]=rows(*pointers[i])
          }
          
          : rows'
                   1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20    21    22    23    24    25    26    27
              +-------------------------------------------------------------------------------------------------------------------------------------------------------------------
            1 |   10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10
              +-------------------------------------------------------------------------------------------------------------------------------------------------------------------
                  28    29    30    31    32    33    34    35    36    37    38    39    40    41    42    43    44    45    46    47    48    49    50    51    52    53    54
               -------------------------------------------------------------------------------------------------------------------------------------------------------------------
            1     10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10
               -------------------------------------------------------------------------------------------------------------------------------------------------------------------
                  55    56    57    58    59    60    61    62    63    64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79    80    81
               -------------------------------------------------------------------------------------------------------------------------------------------------------------------
            1     10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10
               -------------------------------------------------------------------------------------------------------------------------------------------------------------------
                  82    83    84    85    86    87    88    89    90    91    92    93    94    95    96    97    98    99   100
               -------------------------------------------------------------------------------------------------------------------+
            1     10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10    10  |
               -------------------------------------------------------------------------------------------------------------------+
          
          BigOne=J(sum(rows),10,0)
          
          incr=1
          
          for (i=1; i<=length(pointers); i++) {
          BigOne(|incr,1\rows[i],10|)=*pointers[i] <---- <istmt>:  3301  subscript invalid
          incr=incr+rows[i]
          }
          Thanks,

          Comment


          • #6
            Luis: I set up the final i-loop indexing of BigOne a bit differently. Here's a smaller version of your example that seems to work.
            Code:
            mata
            
            dimmat=3
            dimp=5
            
            pointers=J(dimp,1,NULL)
            
            for (i=1; i<=dimp; i++) {
            pointers[i,1]=&J(dimmat,dimmat,i)
            }
            
            rows=J(length(pointers),1,0)
            
            dim(rows)
            
            for (i=1; i<=length(pointers); i++) {
            rows[i,1]=rows(*pointers[i])
            }
            
            BigOne=J(sum(rows),dimmat,0)
            
            for (i=1; i<=length(pointers); i++) {
            BigOne[(1+(i-1)*dimmat)..(i*dimmat),1..dimmat]=*pointers[i]
            }
            
            BigOne
            
            end
            You can also do this directly, without pointers, although in general pointers may offer a nice flexible way to proceed.
            Code:
            mata
            
            dimmat=3
            dimp=5
            
            BigOne=J(dimmat*dimp,dimmat,.)
            
            for (i=1; i<=dimp; i++) {
             BigOne[(1+(i-1)*dimmat)..(i*dimmat),1..dimmat]=J(dimmat,dimmat,i)
            }
            
            BigOne
            
            end

            Comment


            • #7
              Thanks John,

              for the sake of the post original's purpose, "setting up the matrix before populating" X "accumulate each replication's results matrix-by-matrix", follows the timing comparison :

              Code:
              : mata clear
              
              : dimmat=1000
              
              : dimp=500
              
              : timer_on(1)
              
              : BigOne=J(dimmat*dimp,dimmat,.)
              
              : for (i=1; i<=dimp; i++) {
              >  BigOne[(1+(i-1)*dimmat)..(i*dimmat),1..dimmat]=J(dimmat,dimmat,i)
              > }
              
              : timer_off(1)
              
              : timer_on(2)
              
              : BigOne=J(0,dimmat,.)
              
              : for (i=1; i<=dimp; i++) {
              >  BigOne=BigOne\J(dimmat,dimmat,i)
              > }
              
              : timer_off(2)
              
              : timer()
              
              ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
              timer report
                1.       12.1 /        4 =   3.02075
                2.        363 /        4 =  90.70225
              ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

              Comment


              • #8
                wait ... forgot to reset the timer

                Code:
                : mata clear
                
                : timer_clear()
                
                : dimmat=1000
                
                : dimp=500
                
                : timer_on(1)
                
                : BigOne=J(dimmat*dimp,dimmat,.)
                
                : for (i=1; i<=dimp; i++) {
                >  BigOne[(1+(i-1)*dimmat)..(i*dimmat),1..dimmat]=J(dimmat,dimmat,i)
                > }
                
                : timer_off(1)
                
                : mata clear
                
                : dimmat=1000
                
                : dimp=500
                
                : timer_on(2)
                
                : BigOne=J(0,dimmat,.)
                
                : for (i=1; i<=dimp; i++) {
                >  BigOne=BigOne\J(dimmat,dimmat,i)
                > }
                
                : timer_off(2)
                
                : timer()
                
                ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                timer report
                  1.       5.19 /        1 =     5.186
                  2.        679 /        1 =   679.039
                ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                even more impressive.

                Comment

                Working...
                X