Dear Statalists,
The following is a question that has come up again and again when I use the collapse command: besides the fast option, is there any way to further speed up the command?
For example, I am currently trying to collapse a data set with 50 million observations, taking simple sums of 25 indicator variables. Would (count) be faster than (sum)? Would it take less time if I were to partition the data into sets of observations (that I later append) or sets of variables (that I later merge)? If so, what would be the ideal number of observations/variables/bytes per data set?
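For concreteness, the call I am timing looks roughly like the sketch below (ind1-ind25 and groupid are placeholder names):

    * time one collapse run on the full data set
    timer clear 1
    timer on 1
    collapse (sum) ind1-ind25, by(groupid) fast
    timer off 1
    timer list 1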
Thank you very much for your input.
Best wishes,
Milan