
  • Estimate memory usage by a Stata command?

    Hello all,

    Has anyone ever heard of a command, or some such, to estimate the memory consumption of a Stata command?

    I'm working with panel data, months x US counties (~1.2 million observations, ~1.1 GB of data), and the xtreg command I'm using chugs through the data, albeit slowly.

    There have been several situations, though, where my memory usage has exceeded my RAM (16 GB) and bogged the system down to uselessness. I'm working with my IT people to see what other, higher-capacity resources exist, but it would be really helpful to know how much RAM is required so we aren't playing a game of "let's see if this works - oops, well, let's see if this works".

    Thanks in advance for your input!

    P.S.
    (I've read that areg and reghdfe may be more efficient in terms of memory usage, so if anyone has experience with them, thank you in advance.)

    -Michael
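
    A minimal sketch of the reghdfe route mentioned in the P.S., assuming placeholder names: y, x1, x2 and the fixed-effect dimensions county and month are illustrative, not from this thread.

    ```stata
    * reghdfe is a user-written command; install once from SSC
    ssc install reghdfe

    * absorb() sweeps out the fixed effects instead of building dummy variables,
    * which is typically lighter on memory than creating explicit indicators
    * (y, x1, x2, county, and month are placeholder names)
    reghdfe y x1 x2, absorb(county month) vce(cluster county)
    ```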

  • #2
    Michael, in general there is no such documentation - not even of how many tempvars each command creates under the hood. That is understandable, as there are too many code and data dependencies.
    16 GB doesn't sound like a huge amount of memory: a Stata business license costs about as much as 128 GB of RAM at current prices.

    Check if sysinfo is of any use for your investigations: http://radyakin.org/statalist/2013080201/sysinfo.htm
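
    As an aside, a couple of built-in commands at least show the baseline before a heavy command runs - a rough-workflow sketch, not per-command accounting:

    ```stata
    * report Stata's current memory allocation and usage
    memory

    * report N, the number of variables, and the dataset's size in memory
    describe, short
    ```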

    Best, Sergiy Radyakin



    • #3
      Sergiy,

      Thanks very much for your reply. I'm sad to hear that this functionality doesn't exist, but I'll definitely check out whether sysinfo will help.

      -Michael



      • #4
        Michael, don't be sad: it is a change for the better. Recent transformations of Stata's memory manager opened tremendous opportunities for efficiency gains:
        • automatic memory management (no more set mem x, which used to be the first command students had to learn);
        • discontinuous memory (probably), which allows more observations;
        • data compression (identical strLs are collated, if that is the right term for it);
        • balancing of memory between Mata and Stata for mutual benefit (it is not clear what happens to plugins);
        • the possibility to introduce Unicode in Stata 14;
        • and perhaps just as many more, less visible, features.
        This means, however, that the old-school approach of estimating memory requirements as n*k*width (simplified) is not as relevant now as it was in the good old Stata 10 days.
        It also means that the same dataset may now have different representations in memory or on file, all looking identical to the user. See the supcompress discussion:
        http://www.statalist.org/forums/foru...f-strl-vs-strf
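
        Still, as an illustration of that old-school n*k*width estimate (illustrative numbers only: 20 variables at 8 bytes each are assumed, and it ignores tempvars, sort buffers, and the newer memory-manager behavior described above):

        ```stata
        * back-of-the-envelope data footprint: n * k * bytes-per-variable, in GB
        * 1.2 million observations; 20 variables of 8 bytes each are assumed
        display %9.2f 1.2e6 * 20 * 8 / 1024^3
        ```

        The actual peak during estimation can be several times this figure, which is why trial and error is hard to avoid.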

        Best, Sergiy Radyakin

