
  • How much time does the -meanonly- option of -summarize- save?

    Nick and I had a mini-discussion about the benefits of the -meanonly- option of -summarize-. I ran an experiment on my 4-year-old 32-bit PC, Stata SE 13.1:

    Code:
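    clear
    set obs 1000000
    gen x = runiform()
    * report the elapsed time after each command
    set rmsg on
    * full summary, output suppressed
    quietly summarize x
    * mean only; displays nothing, results are left in r()
    summarize x, meanonly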

    Quietly summarizing a million observations took 0.08 seconds.
    With the -meanonly- option it took 0.05 seconds.
    I think I can do without the -meanonly- option.

  • #2
    I dunno. I ran your code on a 64-bit-but-older Core 2 Duo, and got .12 seconds vs .03 seconds. Ran it with the number of cases bumped up by a factor of 10, and got .69 vs .31. Bumped up by a factor of 100, got 5.87 seconds vs 3.04 seconds. So -meanonly- cut the run time by roughly 50-75% in my environment. As with many things we try to benchmark, there are so many unknown factors affecting performance that it's nearly impossible to get a true comparison.

    Hmm. Ran it again (we weren't setting seeds) and got .06 vs .05 seconds when I took it back to a million cases (I was surprised your 32-bit was faster than my 64). So there is a lot of variability in what happens here. We'd need charts of run time against sample size, over numerous seeds and across platforms, to draw any conclusions.
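
    A rough sketch of how such a comparison could be set up, using -timer- to average over repeated calls rather than relying on a single -set rmsg- reading. The sample sizes, seeds and repetition count are arbitrary choices of mine, not something anyone here has actually run:

    ```stata
    * Sketch only: time both forms of -summarize- over several sample sizes and seeds.
    * The sizes, the seeds 1-5 and the 20 repetitions are assumptions for illustration.
    clear all
    local reps 20
    foreach n in 100000 1000000 10000000 {
        forvalues seed = 1/5 {
            clear
            set seed `seed'
            quietly set obs `n'
            generate x = runiform()
            timer clear
            forvalues i = 1/`reps' {
                * timer 1 accumulates the default call
                timer on 1
                quietly summarize x
                timer off 1
                * timer 2 accumulates the meanonly call
                timer on 2
                summarize x, meanonly
                timer off 2
            }
            * -timer list- leaves the accumulated seconds in r(t1) and r(t2)
            quietly timer list
            local t_default = r(t1)/`reps'
            local t_mean = r(t2)/`reps'
            display "n=`n' seed=`seed' default=" %8.4f `t_default' " meanonly=" %8.4f `t_mean'
        }
    }
    ```

    Each displayed line is the average seconds per call for one sample size and seed; running the same do-file on different machines would give the platform comparison.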


    • #3
      Every argument can be correct here.

      In Svend's example I doubt that, working interactively, anyone could type -, meanonly- in 0.03 s, to say nothing of the -display- or -return list- necessary to see results.

      But I see heavily-used programs in which summarize is used again and again, just for a mean or count or sum, and I hear of many people to whom a million observations is tiny. So, there are cases in writing a program where making it faster is a simple and good idea, and people here often benefit from such care, albeit unknowingly.
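
      A minimal sketch of the kind of repeated use meant here, where only the mean, count or sum is needed from each call. The auto dataset and the three variables are just illustrative choices:

      ```stata
      * Sketch only: centre several variables on their means inside a loop.
      * -sysuse auto- and the variable list are assumptions for illustration.
      sysuse auto, clear
      foreach v of varlist price mpg weight {
          * meanonly skips the variance calculation but still returns
          * r(N), r(sum), r(mean), r(min) and r(max)
          summarize `v', meanonly
          display "`v': n = " r(N) ", sum = " r(sum) ", mean = " r(mean)
          generate double `v'_c = `v' - r(mean)
      }
      ```

      With three variables the saving is invisible; the point is that the same pattern may be called thousands of times inside a program or a simulation.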
