Dear all
The -batcher- command is now mature enough to be announced here. It is a very low-level command for parallelising tasks. It works as follows:
- You code your dofile so that it accepts an iteration argument that decides what gets executed (this is very easy).
- You tell batcher which iterations it should supply to the dofile (this is a simple numlist option).
- Batcher then starts Stata instances to run those iterations (with an option to limit the number of simultaneous sessions).
- There's a tracker to keep you up to date on what has worked and what hasn't.
- It is integrated with sendtoslack if you want to get updates on your smartphone or some other device.
- Separate logfiles are generated for each iteration.
exampleDofile
Code:
* display the first argument, i.e. the iteration number supplied by batcher
di `1'
batcher code
Code:
batcher path_to_exampleDofile, i(1/4) tempfolder("C:/temp")
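To make the iteration argument do real work, the dofile can branch on it. What follows is only a sketch and not part of -batcher- itself: the file name myTasks.do and the regressions on the auto dataset are placeholders made up for illustration. It relies on nothing more than the iteration number arriving as the first argument. It would be launched the same way as above, with the path to this dofile in place of path_to_exampleDofile.

Code:
* myTasks.do (hypothetical): run a different model depending on the iteration
args iter                     // first argument = iteration number
sysuse auto, clear
if `iter' == 1 {
    regress price mpg
}
else if `iter' == 2 {
    regress price weight
}
else if `iter' == 3 {
    regress price mpg weight
}
else {
    display "iteration `iter': nothing defined"
}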
For very large tasks on small computers, you might want to limit the number of parallel sessions:
batcher code
Code:
batcher path_to_exampleDofile, i(1/20) tempfolder("C:/temp") maxparallel(4)
- Batcher differs from -parallel-: if I understand correctly, -parallel- works by running the same command on different slices of your dataset, whereas batcher allows any kind of differentiation between iterations.
- If you require more functionality or want to work on a real cluster rather than your own PC, consider Jan Ditzen's -multishell-.
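As an illustration of the kind of differentiation mentioned above, a dofile could use the iteration number to pick one value from a list, so that each Stata instance handles a different setting. This is a hedged sketch only: the file name gridTask.do, the cutoff values and the model are assumptions chosen for the example, and it would be run with i(1/4) to cover the four values.

Code:
* gridTask.do (hypothetical): one price cutoff per iteration
args iter                                // iteration number supplied by batcher
local cutoffs 5000 8000 12000 16000      // made-up values for illustration
local c : word `iter' of `cutoffs'       // pick the cutoff for this iteration
sysuse auto, clear
keep if price <= `c'
regress mpg weight
display "cutoff `c': coefficient on weight = " _b[weight]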
I am very grateful to Jan Ditzen, Sergiy Radyakin, Oscar Ozfidan, Guglielmo Ventura and, of course, Kit Baum for their assistance, comments and bug reports.