Thanks to Kit Baum, a new program called randomtag is now available from SSC.
randomtag is a stand-alone version of the code that was developed to quickly draw a random sample for listsome (also available from SSC).
randomtag draws random observations without replacement and creates an indicator variable that tags observations in the random sample.
randomtag uses the same recipe as Stata's sample command to draw random samples but manages to do so without sorting or otherwise changing the data in memory. Given the same seed, randomtag and sample will produce the same sample.
Because the data in memory is not sorted, randomtag is significantly faster than Stata's sample command and comparable in performance to Andrew Maurer's recent fastsample command, also available from SSC.
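For contrast, here is a minimal sketch of how one might tag a fixed-size random sample using only built-in commands. It is not the code used by randomtag or sample, and it will not in general pick the same observations; it only illustrates why a sort-based approach costs two sorts of the full dataset. The variable names obsno, u and tag are made up for this illustration.

```stata
sysuse gnp96.dta, clear
set seed 12345

* remember the original order so it can be restored afterwards
generate long obsno = _n

* one uniform draw per observation (runiform() needs a newer Stata than version 9)
generate double u = runiform()

* the 5 smallest draws form the sample: this is the first full sort
sort u
generate byte tag = _n <= 5

* second full sort, just to put the data back the way it was
sort obsno
drop u obsno

count if tag
```

Both sorts touch every observation, which is the overhead that a no-sort approach avoids.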
Unlike sample and fastsample, randomtag does not discard observations. Here's a quick example:
Code:
. sysuse gnp96.dta, clear

. set seed 12345

. randomtag , count(5) gen(t)

. list if t

     +---------------------+
     |   date    gnp96   t |
     |---------------------|
 25. | 1973q1   4439.6   1 |
 40. | 1976q4   4720.7   1 |
 55. | 1980q3   5179.2   1 |
 64. | 1982q4   5221.7   1 |
 76. | 1985q4   5996.7   1 |
     +---------------------+

. set seed 12345

. sample 5, count
(137 observations deleted)

. sort date

. list

     +---------------------+
     |   date    gnp96   t |
     |---------------------|
  1. | 1973q1   4439.6   1 |
  2. | 1976q4   4720.7   1 |
  3. | 1980q3   5179.2   1 |
  4. | 1982q4   5221.7   1 |
  5. | 1985q4   5996.7   1 |
     +---------------------+
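Because nothing is discarded, the tag can be used as an ordinary if condition while the rest of the data stays available. The following is a small sketch that reuses only the randomtag syntax shown above; the summarize comparison is just an illustration of what one might do with the tag.

```stata
sysuse gnp96.dta, clear
set seed 12345
randomtag , count(5) gen(t)

* the tag marks exactly the sampled observations
count if t

* compare the sampled observations with the ones left untagged
summarize gnp96 if t
summarize gnp96 if !t

* all observations are still in memory
count
```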
randomtag requires Stata version 9 or newer.