Originally posted by charlie wong
Code:
clear all
set obs 10000000
gen x = rnormal()
timeit 1: gen sum = sum(x)
timeit 2: sum x, d
timeit 3: sum x, meanonly
timer list
1: 0.52 / 1 = 0.5220
2: 14.73 / 1 = 14.7260
3: 0.11 / 1 = 0.1050
Comparing 1 and 2, sum x, d takes far longer than gen sum = sum(x) (about 14.7 s versus 0.5 s, roughly 28 times slower). These are the commands used by skew() and by mean()/sd() respectively, which explains why performance for mean/sd was still fine while skew was not. Comparing 1 and 3, sum x, meanonly is about five times faster again than sum(x) (0.11 s versus 0.52 s), so switching to it could potentially speed up mean/sd by a further factor of 5. Of course, this ignores some details, but the contrast might actually be starker still ...
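To make the comparison concrete, here is a minimal sketch of pulling the statistics out of summarize's stored results instead of building them from running sums. It is only an illustration, not how the egen functions are actually implemented. The variable names mean_x, sd_x and skew_x are made up for the example. One of the "details" is that meanonly does not return r(sd), so the sketch uses plain summarize for the mean and sd.

```stata
clear all
set obs 1000000
gen x = rnormal()

* Plain summarize leaves the mean and standard deviation in r()
quietly summarize x
gen double mean_x = r(mean)
gen double sd_x   = r(sd)

* The detail option is only needed for higher moments such as skewness
quietly summarize x, detail
gen double skew_x = r(skewness)
```

If only the mean is needed, quietly summarize x, meanonly followed by r(mean) would be the cheapest of the three variants timed above.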
Some might say I'm obsessed with speed tests (I may or may not have a folder on my PC with a bunch of different speed comparisons...). Did you know that using egen tag = tag() followed by drop if tag == 0 is considerably faster than duplicates drop, force (~20-50%, at least in my stylised test)? Or that drop <varlist> (multiple vars at once) is massively faster than a succession of drop <varname> (a single var at a time)? Well, now you do.
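For anyone who wants to try these two comparisons themselves, something along these lines should work. This is a rough sketch and not the original test: the dataset sizes, the variable names (id, v1-v50) and the timer numbers are all arbitrary choices, and it uses Stata's built-in timer command so that nothing needs to be installed. Results will depend on the data and the machine.

```stata
clear all
timer clear
set seed 12345

* --- Test 1: duplicates drop versus egen tag() ---
set obs 1000000
gen long id = ceil(runiform() * 100000)   // many repeated ids

preserve
timer on 1
duplicates drop id, force
timer off 1
restore

preserve
timer on 2
egen byte tag = tag(id)    // 1 for the first obs of each id, 0 otherwise
drop if tag == 0
drop tag
timer off 2
restore

* --- Test 2: one drop per variable versus a single drop of the varlist ---
clear
set obs 200000
gen long id = _n           // kept so the dataset never becomes empty
forvalues i = 1/50 {
    gen v`i' = runiform()
}

preserve
timer on 3
forvalues i = 1/50 {
    drop v`i'
}
timer off 3
restore

preserve
timer on 4
drop v1-v50
timer off 4
restore

* Timers 1/2 compare the duplicate removal, 3/4 the drop variants
timer list
```

The preserve/restore pairs sit outside the timed regions, so each approach starts from the same data without the copying being counted.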