Hi,
My code runs slowly and freezes at unexpected points. I also see extraordinary RAM usage at times (though it still does not exhaust what the system has, so it may not be the cause of the freezes).
Is there a reasonable way to profile Stata ado files? I have two problems with -profiler-:
1. It only times the execution of commands and does nothing else. E.g., you don't see memory use unless it has an indirect effect on runtime.
2. I see no logic in the level at which runtime is disaggregated. I definitely see some commands like -twoway- in fine detail, down to the multiple commands it calls. On the other hand, I don't see many other commands my ado file calls, e.g. lines of -egen-. Or is it that you can't see inside loops because those are compiled? That would be a bit confusing, as programs (subroutines) are also compiled, no?
This is becoming a critical issue, as I would like to hold most of my data in memory (Stata is not that efficient at merging in certain variables only when needed, not to mention preserve-restore cycles).
As my data can double in memory (from 25 GB to 50 GB) even when I run simple things on only a few variables, it would be great to see where that happens. I also expect unpleasant surprises on runtime. E.g., I just learnt that -twoway__scatteri_serset- can run for five minutes on my data, even though I would expect an immediate graphing command to bear no relation to how much data there is.
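To make the gap concrete, here is a minimal sketch of the kind of manual instrumentation I would otherwise have to scatter through my ado files: bracketing a suspect step with -memory- and -timer- while -profiler- is running. The auto dataset and the -egen- line are stand-ins I picked for illustration, not my actual data or code, and I am assuming a Stata version recent enough to have the -memory- command.

```stata
* Sketch only: auto.dta and the -egen- line are hypothetical stand-ins
sysuse auto, clear

profiler clear          // reset any earlier profiling results
profiler on             // start collecting timings
timer clear 1           // reset timer 1

memory                  // memory report before the suspect step
timer on 1
egen mean_price = mean(price), by(foreign)
timer off 1
memory                  // memory report after the suspect step

profiler off
profiler report         // timings collected by -profiler-
timer list 1            // elapsed time of the bracketed block
```

This has to be run from a do-file because of the // comments, and it only shows memory before and after a step I already suspect, which is exactly why I am looking for something more systematic.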
Thanks,
Laszlo
