  • Substantive differences between "memory" and "size"

    What is the difference between "Size" and "Memory" in the Properties/Data tab of the main IDE window?

    I frequently work with very large datasets and like to get the dataset smaller before certain operations to improve compute speed. Which is the more relevant property for this purpose?

  • #2
    "Memory" is how much memory has been allocated to Stata for use.

    "Size", by contrast, is the size of the dataset that is currently in memory.

    The memory figure is somewhat antiquated now that Stata can use non-contiguous blocks of memory. Historically it could not, which meant you had to reserve all the memory for the analysis ahead of time.

    Shrinking the data with "compress" is good for storage and for read/write times, but it does not do much for computational speed.
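
    A minimal sketch of what compress changes, using made-up data (the variable names id and score are only for illustration). It stores integer values as doubles on purpose, then compares the raw data size, computed as observations times bytes per observation, before and after:

    ```stata
    * Illustrative data only: integer values deliberately stored as doubles
    clear
    set obs 1000000
    set seed 12345
    generate double id = _n
    generate double score = runiformint(0, 100)

    * Raw data size in bytes = number of observations * bytes per observation
    display "bytes before compress: " c(N) * c(width)
    memory

    * compress demotes each variable to the smallest type that loses no information
    compress

    display "bytes after compress:  " c(N) * c(width)
    memory
    ```

    The memory command reports the data plus Stata's overhead, so its totals will not match the displayed byte counts exactly.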

    Look at gtools for faster egen functions.
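
    As a rough sketch, assuming gtools has been installed (it is a community-contributed package, for example via ssc install gtools), you could time egen against gegen on simulated data. The variable names here are made up, and the timings will depend on your machine and data:

    ```stata
    * Assumes gtools is installed: ssc install gtools
    clear
    set obs 1000000
    set seed 12345
    generate int group = runiformint(1, 1000)
    generate double x = rnormal()

    timer clear

    * Built-in egen
    timer on 1
    egen mean_egen = mean(x), by(group)
    timer off 1

    * gtools equivalent
    timer on 2
    gegen mean_gegen = mean(x), by(group)
    timer off 2

    * Timer 1 is egen, timer 2 is gegen
    timer list
    ```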

    Creating an index on a column and using "in" is faster than "if" when you need to do many different things on the same column, but creating the index has its own cost.
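
    One way to read the "index" idea, as an assumption on my part, is to sort on the column once and then record the row range that each value occupies. The sketch below does that for a single hypothetical group value (42) on simulated data, so that later commands can use "in" on a contiguous block instead of evaluating "if" on every row:

    ```stata
    * Illustrative data only
    clear
    set obs 1000000
    set seed 12345
    generate int group = runiformint(1, 100)
    generate double x = rnormal()

    * One-time cost: sort so each group occupies a contiguous block of rows
    sort group

    * Find the first and last row of the block for group 42
    generate long obsno = _n
    summarize obsno if group == 42, meanonly
    local first = r(min)
    local last  = r(max)

    * Repeated work on that group can now use the row range
    summarize x in `first'/`last'
    count if x > 0 in `first'/`last'
    ```

    The row range is only valid while the sort order is unchanged, so it has to be rebuilt after any command that reorders or drops observations.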

    Look at packages like qsub, multishell, and batcher for data-level parallelisation and processing.



