Hi all,
I have a 22.56 GB .csv file which I have imported into Stata (after a long wait). I would now like to compress it to reduce its memory usage.
For the moment, after the import I have just run the compress command naively, like this:
Code:
import delimited "/Users/federiconutarelli/Dropbox/PRIN Green Nutarelli/dataset/df_emakg.csv", clear

*** SAVING MEMORY STEPS:
compress // this is to reduce the size of the dataset (it takes forever)
but it is taking forever: this is the third day in a row that Stata has been compressing strings, with the output:
Code:
is strL now coalesced
Now my question is whether there is a faster and more efficient way to compress the dataset (e.g. should I use recast first?). Please note that the .csv file comes from a pandas DataFrame that I had managed to compress down to 8 GB. Probably one of the formats got mangled in the conversion to .csv, which is why the .csv file ended up taking 22.56 GB.
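For example, would something along these lines be more sensible? It is just an untested sketch of what I have in mind: list the strL variables with ds and compress them one at a time, so that I can at least see the progress, before compressing the rest.
Code:
* Sketch (untested): find the strL variables and compress them one by one,
* instead of running a single compress over the whole dataset
ds, has(type strL)
foreach v of varlist `r(varlist)' {
    display as text "compressing `v' ..."
    compress `v'
}

* second pass over the remaining (non-strL) variables
compress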
Thank you,
Federico