  • Compress and uncompress Mata binary matrix file?

    Dear all,

    I am writing a function that saves a Mata matrix to a file using Mata functions (fopen, fputmatrix), so that I can read it back later for further calculations. I think it is fast, but the resulting file is very large.

    For example, a Mata matrix of 5 million records x 100 variables produces a file of about 3.72 GB, while the Stata file for the same records and variables is about 700 MB. Similarly, 50 million records x 100 variables produces about 40 GB from Mata and about 7.6 GB as a Stata file.

    Is there a way to compress and uncompress the Mata matrix file without affecting the speed of reading the data back into Mata?

    Thank you very much for any help.

    Minh

  • #2
    While not impossible, it would surprise me if that could be done. The simple reason is that, AFAIK, every numeric value in Mata is stored as a double, which takes up 8 bytes per value. Hence, 5M * 100 * 8 bytes = 4 GB.
    Given how easy it is to transfer data between Stata and Mata, though, is it really necessary to save a Mata matrix to a file?
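
    The size estimate above can be checked with a few lines of arithmetic. The sketch below is in Python rather than Mata, purely as a calculator; the function names are made up for illustration. It assumes a dense matrix with every value stored as an 8-byte double and no file overhead. Note that 4 GB in decimal units is about 3.73 GiB in binary units, which is close to the 3.72 figure reported in the first post if that figure was measured in binary units (an assumption).

```python
# Back-of-envelope size of a dense matrix stored as 8-byte doubles.
# Assumption: no header or padding; every cell costs exactly 8 bytes.
BYTES_PER_DOUBLE = 8


def matrix_bytes(rows, cols, bytes_per_value=BYTES_PER_DOUBLE):
    """Return the raw storage needed for a rows x cols matrix, in bytes."""
    return rows * cols * bytes_per_value


def to_gib(n_bytes):
    """Convert bytes to binary gigabytes (GiB, 2**30 bytes)."""
    return n_bytes / 2**30


small = matrix_bytes(5_000_000, 100)    # 5 million records x 100 variables
large = matrix_bytes(50_000_000, 100)   # 50 million records x 100 variables

print(small, round(to_gib(small), 2))
print(large, round(to_gib(large), 2))
```

    The same formula gives 40 GB (decimal) for the 50 million record case, in line with the size reported in the first post.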


    • #3
      Thanks, and yes, we know about the 8 bytes per value in Mata.
      It is not strictly necessary, but for convenience we are using those functions until we can figure out how to save a large matrix to a file and read it back one column at a time in Mata. We need this because we are doing large matrix calculations, and Mata cannot hold a large matrix in a typical Stata and machine environment (32-bit, 4 GB RAM).

      Thanks,
      Minh
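
      The idea of reading one column at a time can be sketched as follows. This is a language-neutral illustration written in Python, not Mata code, and it does not use the fputmatrix file format: it assumes a hypothetical raw layout in which the columns are written back to back as 8-byte doubles, so that column j starts at byte offset j * n_rows * 8. The function names and the file name are invented for the example. Mata's own file functions (for example fseek() and fbufget()) might support a similar approach, but that is untested here.

```python
import os
import tempfile
from array import array

BYTES_PER_DOUBLE = 8  # array("d") stores C doubles, 8 bytes on common platforms


def write_columns(path, columns):
    """Write equal-length columns back to back as raw 8-byte doubles."""
    with open(path, "wb") as f:
        for col in columns:
            array("d", col).tofile(f)


def read_column(path, n_rows, j):
    """Read only column j (0-based) by seeking to its byte offset."""
    col = array("d")
    with open(path, "rb") as f:
        f.seek(j * n_rows * BYTES_PER_DOUBLE)
        col.fromfile(f, n_rows)
    return list(col)


# Tiny demonstration: a 3 x 3 matrix given as a list of columns.
columns = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
path = os.path.join(tempfile.mkdtemp(), "matrix.bin")
write_columns(path, columns)

second = read_column(path, n_rows=3, j=1)
print(second)
```

      With this layout only one column needs to be in memory at a time, which is the point of the request above. The file is no smaller than before, so this addresses the memory limit rather than the file size.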
