
  • Stata 15.1 Mac memory issue

    My Stata is running *very, very* slowly for some processes. (macOS Sierra 10.12.6; Stata is fully updated.)

    With

    Code:
    sysuse auto
    memory
    Everything looks pretty normal, as in the manual entry for -memory-, except:

    [...]

    overhead 1,081,344 1,082,136

    grand total 4,535,885 72,197,839

    Any ideas about the overhead there? Is there something that -clear all- doesn't get?
    ____________________________________________________
    Assistant Professor, Department of Biostatistics and Epidemiology
    School of Public Health and Health Sciences
    University of Massachusetts- Amherst

  • #2
    Can you give examples of what is running slowly? And what other applications do you have running at the same time?



    • #3
      The overall and overhead usage you show is in line with my current memory use with the auto.dta and my instance is not running slowly on my Mac and Stata 15.1.
      I suspect this has something to do with the OS or the installation of Stata, not something that is fixable by changing Stata settings. I would (1) update your Mac to the current macOS version (10.13.x) and (2) reinstall Stata to see if this helps.

      You could try changing the memory settings (maxvar, niceness, max_memory, etc.) to help speed things up a bit, but I doubt it is going to fix a 'slow' instance of Stata. (The only exception I can think of is a dataset larger than your machine's memory, which forces Stata to use the disk as virtual memory and will definitely slow things down, but you don't mention this situation.)

      If I run:

      Code:
      set rmsg on, perm
      memory
      query memory
      
      
      sysuse auto, clear
      memory
      regress price mpg t* h i.r
      
      clear all
      set maxvar 2048
      set max_memory 500m
      memory
      sysuse auto, clear
      memory
      regress price mpg t* h i.r
      The output below shows how much memory is allocated before and after lowering the maxvar and max_memory settings. My initial maxvar in the example is 20000 (I'm using MP); I lower it to the minimum (2048), and the change shows up in the "set maxvar usage" line (23,701,954 versus 2,428,834 bytes), while the "overhead" line itself stays the same. The regression in the example does speed up when I lower my Stata settings (t=0.10 versus t=0.05), but not by much. Again, my bet is that something else is going on with your system (as William Buchanan suggests, check what other applications are running).

      Code:
      . set rmsg on, perm
      (set rmsg preference recorded)
      r; t=0.00 7:28:18
      
      . memory
      
        Memory usage
                                                  used                allocated
          ---------------------------------------------------------------------
          data                                       0               67,108,864
          strLs                                      0                        0
          ---------------------------------------------------------------------
          data & strLs                               0               67,108,864
      
          ---------------------------------------------------------------------
          data & strLs                               0               67,108,864
          var. names, %fmts, ...                     4                   65,000
          overhead                           1,081,344                1,082,136
      
          Stata matrices                             0                        0
          ado-files                              3,115                    3,115
          stored results                             0                        0
      
          Mata matrices                              0                        0
          Mata functions                             0                        0
      
          set maxvar usage                  23,701,954               23,701,954
      
          other                                  3,945                    3,945
          ---------------------------------------------------------------------
          grand total                       24,786,122               91,965,014
      r; t=0.00 7:28:18
      
      . query memory
      ------------------------------------------------------------------------------------------------------------------------
          Memory settings
            set maxvar           20000      2048-120000; max. vars allowed
            set matsize          11000      10-11000; max. # vars in models
            set niceness         5          0-10
            set min_memory       0          0-1600g
            set max_memory       .          32m-1600g or .
            set segmentsize      32m        1m-32g
      r; t=0.00 7:28:18
      
      . 
      . 
      . sysuse auto, clear
      (1978 Automobile Data)
      r; t=0.00 7:28:18
      
       
      . regress price mpg t* h i.r
      
            Source |       SS           df       MS      Number of obs   =        69
      -------------+----------------------------------   F(8, 60)        =      2.92
             Model |   161665401         8  20208175.1   Prob > F        =    0.0080
          Residual |   415131558        60   6918859.3   R-squared       =    0.2803
      -------------+----------------------------------   Adj R-squared   =    0.1843
             Total |   576796959        68  8482308.22   Root MSE        =    2630.4
      
      ------------------------------------------------------------------------------
             price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               mpg |  -245.8753    86.2927    -2.85   0.006    -418.4864   -73.26419
             trunk |   103.7782    117.929     0.88   0.382    -132.1149    339.6713
              turn |   45.97527   122.6736     0.37   0.709    -199.4086    291.3591
          headroom |  -632.1889   512.8959    -1.23   0.223    -1658.133    393.7556
                   |
             rep78 |
                2  |   1224.583   2190.872     0.56   0.578    -3157.813    5606.979
                3  |   1669.832   2057.038     0.81   0.420    -2444.856     5784.52
                4  |   2039.639   2101.977     0.97   0.336    -2164.941     6244.22
                5  |   3356.016   2194.784     1.53   0.131    -1034.205    7746.237
                   |
             _cons |   8067.111   6310.632     1.28   0.206    -4556.032    20690.25
      ------------------------------------------------------------------------------
      r; t=0.10 7:28:18
      
      . 
      . clear all
      r; t=0.00 7:28:18
      
      . set maxvar 2048
      
      r; t=0.00 7:28:18
      
      . set max_memory 500m
      r; t=0.00 7:28:18
      
      . memory
      
        Memory usage
                                                  used                allocated
          ---------------------------------------------------------------------
          data                                       0               67,108,864
          strLs                                      0                        0
          ---------------------------------------------------------------------
          data & strLs                               0               67,108,864
          maximum allowed (set max_memory)                          524,288,000
      
          ---------------------------------------------------------------------
          data & strLs                               0               67,108,864
          var. names, %fmts, ...                     4                   65,000
          overhead                           1,081,344                1,082,136
      
          Stata matrices                             0                        0
          ado-files                                  0                        0
          stored results                             0                        0
      
          Mata matrices                              0                        0
          Mata functions                             0                        0
      
          set maxvar usage                   2,428,834                2,428,834
      
          other                                  3,945                    3,945
          ---------------------------------------------------------------------
          grand total                        3,510,523               70,688,779
      r; t=0.00 7:28:18
      
      . sysuse auto, clear
      (1978 Automobile Data)
      r; t=0.00 7:28:18
      
      . memory
      
        Memory usage
                                                  used                allocated
          ---------------------------------------------------------------------
          data                                   3,182               67,108,864
          strLs                                      0                        0
          ---------------------------------------------------------------------
          data & strLs                           3,182               67,108,864
          maximum allowed (set max_memory)                          524,288,000
      
          ---------------------------------------------------------------------
          data & strLs                           3,182               67,108,864
          var. names, %fmts, ...                 4,177                   68,029
          overhead                           1,081,344                1,082,136
      
          Stata matrices                             0                        0
          ado-files                              5,465                    5,465
          stored results                             0                        0
      
          Mata matrices                              0                        0
          Mata functions                             0                        0
      
          set maxvar usage                   2,428,834                2,428,834
      
          other                                  5,165                    5,165
          ---------------------------------------------------------------------
          grand total                        3,522,867               70,698,493
      r; t=0.00 7:28:18
      
      . regress price mpg t* h i.r
      
            Source |       SS           df       MS      Number of obs   =        69
      -------------+----------------------------------   F(8, 60)        =      2.92
             Model |   161665401         8  20208175.1   Prob > F        =    0.0080
          Residual |   415131558        60   6918859.3   R-squared       =    0.2803
      -------------+----------------------------------   Adj R-squared   =    0.1843
             Total |   576796959        68  8482308.22   Root MSE        =    2630.4
      
      ------------------------------------------------------------------------------
             price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               mpg |  -245.8753    86.2927    -2.85   0.006    -418.4864   -73.26419
             trunk |   103.7782    117.929     0.88   0.382    -132.1149    339.6713
              turn |   45.97527   122.6736     0.37   0.709    -199.4086    291.3591
          headroom |  -632.1889   512.8959    -1.23   0.223    -1658.133    393.7556
                   |
             rep78 |
                2  |   1224.583   2190.872     0.56   0.578    -3157.813    5606.979
                3  |   1669.832   2057.038     0.81   0.420    -2444.856     5784.52
                4  |   2039.639   2101.977     0.97   0.336    -2164.941     6244.22
                5  |   3356.016   2194.784     1.53   0.131    -1034.205    7746.237
                   |
             _cons |   8067.111   6310.632     1.28   0.206    -4556.032    20690.25
      ------------------------------------------------------------------------------
      r; t=0.05 7:28:18
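
      As a rough check on the two -memory- listings above, the "set maxvar usage" and "grand total" lines can be compared directly. This is only arithmetic on the numbers already shown; the per-variable figure it produces is an inference from these two runs, not a documented constant.

      ```stata
      * "set maxvar usage" copied from the listings above:
      * first at maxvar 20000, then at maxvar 2048
      local usage_20000 = 23701954
      local usage_2048  = 2428834

      * Implied bytes per unit of maxvar (inferred from these two runs only)
      display (`usage_20000' - `usage_2048') / (20000 - 2048)

      * Drop in the allocated grand total between the two listings
      display %15.0fc 91965014 - 70688779
      ```

      The first line works out to 1185 bytes per unit of maxvar. The second gives 21,276,235 bytes, which is the maxvar saving (21,273,120) plus the 3,115 bytes of ado-files released by -clear all-. About 21MB is small next to the memory of a typical machine, which fits with the modest change in run time.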

      Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX



      • #4
        Unless the processes running slowly use the auto dataset, I think it would be more illuminating to include the full output of the memory command run just before and just after whatever process runs more slowly than expected. Or does that process make temporary extraordinary demands on memory? In that regard I agree very much with Eric Booth's assessment: if you have exceeded hardware memory, Stata will become less responsive. The tool to help diagnose this is Activity Monitor; look at the Memory tab when things are running slowly.

        To test this I ran the following on my 8GB system running macOS 10.12.6 and Stata/SE 15.1:
        Code:
        clear 
        set obs 10000000
        gen double x = rnormal()
        memory
        expand 10
        memory
        expand 10
        memory
        At the end, I'm looking at
        Code:
          Memory usage
                                                    used                allocated
            ---------------------------------------------------------------------
            data                           8,000,000,000           10,066,329,600
            strLs                                      0                        0
            ---------------------------------------------------------------------
            data & strLs                   8,000,000,000           10,066,329,600
        
            ---------------------------------------------------------------------
            data & strLs                   8,000,000,000           10,066,329,600
            var. names, %fmts, ...                   329                   32,829
            overhead                       2,000,718,160            2,000,718,560
        
            Stata matrices                             0                        0
            ado-files                             12,127                   12,127
            stored results                             0                        0
        
            Mata matrices                              0                        0
            Mata functions                             0                        0
        
            set maxvar usage                  38,830,849               38,830,849
        
            other                                  2,503                    2,503
            ---------------------------------------------------------------------
            grand total                   10,039,559,304           12,105,926,468
        and Activity Monitor tells me Stata is using 11.32GB of memory. During the final expand, I was looking at the beachball for a while, the Memory Pressure chart in Activity Monitor was well into the red, and it told me Stata was using 15GB of memory. It also indicated that Stata was unresponsive, although in fact Stata was just not paying attention to Activity Monitor. After the command completed, the Memory Pressure chart dropped back into the green, Stata was again responsive, etc.
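
        The size of the final dataset follows from the commands above. A quick sketch of the arithmetic, using only figures from the listing and the 8GB of RAM mentioned earlier (treating 1GB as 1024^3 bytes is an assumption):

        ```stata
        * 10,000,000 observations, expanded by 10 twice
        local n = 10000000 * 10 * 10

        * One double takes 8 bytes per observation
        display %18.0fc `n' * 8

        * Overhead bytes per observation, from the listing above
        display 2000718160 / `n'

        * Grand total used, in GB (1GB = 1024^3 bytes assumed)
        display 10039559304 / 1024^3
        ```

        The data come to 8,000,000,000 bytes, the overhead to about 2 bytes per observation, and the total used to roughly 9.35GB, which is more than the 8GB of physical memory and so consistent with the memory pressure described.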

        So my advice would be further diagnostics looking "behind the scenes" using the macOS Activity Monitor app while Stata is being slow.
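
        One way to collect matching evidence from the Stata side is to bracket the slow step with -memory- and a timer, and keep the output in a log to read alongside Activity Monitor. A minimal sketch: the log file name memcheck.log is hypothetical, and the regress line is only a stand-in for whatever command is actually slow.

        ```stata
        * Keep a text log of everything below (file name is hypothetical)
        log using memcheck.log, replace text

        sysuse auto, clear

        * Memory report just before the suspect step
        memory

        * Time the suspect step; regress is only a placeholder here
        timer clear 1
        timer on 1
        regress price mpg
        timer off 1

        * Memory report just after, then the elapsed time
        memory
        timer list 1

        log close
        ```

        If the two memory reports are nearly identical but the timer is long, the slowdown probably lies outside Stata's own allocations.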




        • #5
          Very useful, everyone, thanks! There has not been any consistent pattern to the slow processes, even when watching Activity Monitor. I'll update macOS and then see where things are.

          Edited to add results from William's test:

          Code:
           Memory usage
                                                      used                allocated
              ---------------------------------------------------------------------
              data                           8,000,000,000           12,046,041,088
              strLs                                      0                        0
              ---------------------------------------------------------------------
              data & strLs                   8,000,000,000           12,046,041,088
          
              ---------------------------------------------------------------------
              data & strLs                   8,000,000,000           12,046,041,088
              var. names, %fmts, ...                   329                   65,000
              overhead                       2,000,718,632            2,000,719,016
          
              Stata matrices                             0                        0
              ado-files                          1,271,936                1,271,936
              stored results                        13,995                   13,995
          
              Mata matrices                        134,096                  134,096
              Mata functions                       146,880                  146,880
          
              set maxvar usage                   2,428,834                2,428,834
          
              other                                162,694                  162,694
              ---------------------------------------------------------------------
              grand total                   10,004,295,880           14,050,983,539
          
          .
          end of do-file
          Last edited by Andrew Lover; 18 Dec 2017, 17:54.



          • #6
            Andrew Lover
            You can also use either the sysresources or clicmd commands from StataOS to access some of this information within Stata. Activity Monitor can be a memory/CPU hog at times, but there are some decent built-in command line utilities you can use to get a fair amount of data about what is happening in the background of the system.
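
            Independently of those commands (whose syntax is not shown here), Stata's own -shell- command can call macOS utilities directly. A minimal sketch, assuming a standard macOS install where sysctl and vm_stat are available; how the output is displayed may depend on how Stata is being run.

            ```stata
            * Physical memory installed, in bytes
            shell sysctl -n hw.memsize

            * Virtual memory statistics, including free pages, pageins and pageouts
            shell vm_stat
            ```

            Running vm_stat before and after a slow command and comparing the pageout counts is one rough way to see whether the system was swapping in the meantime.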



            • #7
              Also look into set matsize. Here's a post from 2014 that gives some insight as to why and how this may slow down calculations. As far as I can tell, this is still relevant as the help entries still warn about not setting matsize larger than necessary. For example, [U] 6.4 Setting Matsize:

              If you use a 32-bit computer, you likely will be unable to set matsize to 11,000. A value of 11,000
              would require nearly 1 gigabyte per matrix. The total memory consumption most 32-bit operating
              systems will grant to Stata is 2 gigabytes, so if you had two matrices, there would be no memory
              left for data or for Stata’s code!

              You should not set matsize larger than is necessary. Doing so will at best waste memory and at
              worst slow Stata down or prevent Stata from having enough memory for other tasks. If you receive
              the error message “matsize too small”, increase matsize only as much as is necessary to eliminate
              the error message.
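
              To see where matsize currently stands and to lower it, something like the following should work in Stata 15, the version used in this thread. The value 400 is purely illustrative, not a recommendation.

              ```stata
              * Current matsize setting
              display c(matsize)

              * Memory settings table, including the allowed range for matsize
              query memory

              * Lower matsize for this session (illustrative value only)
              set matsize 400
              ```

              In the query memory output earlier in the thread, matsize is shown at 11000, the top of its allowed range, so it may be worth checking on the slow machine as well.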

