  • Is -python script- much slower in Stata 17 than in Stata 16?

    I use the tuples command (from SSC). By default, tuples does its work via python script; with the nopython option, it uses Mata code instead.

    I run the following script:

    Code:
    cls
    about
    python query
    
    clear all
    macro drop _all
    
    numlist "1/17"
    
    timer clear
    
    // timing python script
    timer on 1
    tuples `r(numlist)'
    timer off 1
    
    macro drop _tuple*
    
    // timing mata code
    timer on 2
    tuples `r(numlist)' , nopython
    timer off 2
    
    timer list

    Here are the results from Stata 16:

    Code:
    Stata/IC 16.1 for Windows (64-bit x86-64)
    Revision 14 Jun 2022
    Copyright 1985-2019 StataCorp LLC
    
    Total physical memory:       16.00 GB
    Available physical memory:   10.67 GB
    
    Stata license: Single-user  perpetual
    Serial number: omitted
      Licensed to: omitted
    
    . python query
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        omitted
    
        Python system information
          initialized          yes
          version              3.10.0
          architecture         64-bit
          library path         omitted\Python\Python310\python310.dll
    
    omitted
    
    . timer list
       1:      2.47 /        1 =       2.4660
       2:      5.35 /        1 =       5.3530

    Python does the job in about 2.5 seconds, less than half the time Mata takes.


    Here are the results from the identical script in Stata 17:

    Code:
    . about
    
    Stata/BE 17.0 for Windows (64-bit x86-64)
    Revision 23 Aug 2022
    Copyright 1985-2021 StataCorp LLC
    
    Total physical memory:       16.00 GB
    Available physical memory:   10.65 GB
    
    Stata license: Single-user  perpetual
    Serial number: omitted
      Licensed to: omitted
    
    . python query
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        omitted
        
        Python system information
          initialized          yes
          version              3.10.4
          architecture         64-bit
          library path         omitted\Python\Python310\python310.dll
    
    omitted
    
    . timer list
       1:      6.54 /        1 =       6.5350
       2:      4.90 /        1 =       4.8980

    Good news: Mata is faster now. Bad news: Python is much slower.

    What is going on?
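    One way to narrow this down might be to time comparable work in standalone Python, outside either version of Stata. The following is only a minimal sketch: it assumes the Python branch of tuples does something similar to enumerating combinations with itertools, which has not been checked against the ado-file here, and count_tuples is a hypothetical helper name.

```python
import itertools
import time


def count_tuples(n_items):
    """Enumerate every non-empty combination of n_items items and count them."""
    items = range(1, n_items + 1)
    total = 0
    for k in range(1, n_items + 1):
        # itertools.combinations yields each k-element subset exactly once
        for _ in itertools.combinations(items, k):
            total += 1
    return total


# Time the enumeration for the same item set as numlist "1/17"
start = time.perf_counter()
n_tuples = count_tuples(17)
elapsed = time.perf_counter() - start
print(f"{n_tuples} tuples in {elapsed:.3f} seconds")
```

    If standalone timings like this are similar no matter which Stata session is open, that would point toward the Stata side of the integration rather than the interpreter itself. The sketch does not create Stata macros, so it is not expected to match the timer results above.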
    Last edited by daniel klein; 28 Sep 2022, 06:25.

  • #2
    daniel klein, I get somewhat different results on my Mac.

    This is Stata 17:

    Code:
    . timer list
       1:      2.23 /        1 =       2.2250
       2:      1.63 /        1 =       1.6280
    
    . about
    
    Stata/MP 17.0 for Mac (Intel 64-bit)
    Revision 23 Aug 2022
    Copyright 1985-2021 StataCorp LLC
    
    Total physical memory: 16.00 GB
    
    Stata license: Single-user 12-core <omitted>

    As compared to Stata 16:

    Code:
    . timer list
       1:      1.80 /        1 =       1.8000
       2:      1.51 /        1 =       1.5080
    
    .
    end of do-file
    
    . about
    
    Stata/MP 16.1 for Mac (Intel 64-bit)
    Revision 14 Jun 2022
    Copyright 1985-2019 StataCorp LLC
    
    Total physical memory: 16.00 GB
    
    Stata license: Single-user 2-core  <omitted>

    Also, the python setup:
    Code:
    . python query 
    -------------------------------------------------------------------------------------------------------------------------------
       <omitted>
        Python system information
          initialized          yes
          version              3.10.5
          architecture         64-bit

    So for me, Mata is faster in both. The bad news: Stata 16/MP 2-core outperforms Stata 17/MP 12-core!
    Last edited by Hemanshu Kumar; 28 Sep 2022, 06:31.



    • #3
      Regarding #1: This is very strange. Is it possible there are other background processes (or on a server, other users) that might be consuming system resources and interfering with the second test? If so, then if you repeat the test there should be variation in the total time each procedure takes. If you repeat the test and consistently get the same results, then I have no idea what's happening. I would assume you only have one python 3 environment installed? If so, then I believe the python bytecode implementation should be the same regardless of the version of Stata, correct?

      Regarding #2: I wonder if the problem size is large enough that the 12-core implementation is worth the overhead associated with starting a bunch of separate threads or possibly starting virtual threads? This also raises another question: Is -tuples- written in such a way that it is parallelizable?
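      The repeat-and-compare check suggested for #1 could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the tuples code: enumerate_tuples is a hypothetical stand-in workload, summarize is a hypothetical helper, and the repeat count of 5 is arbitrary.

```python
import itertools
import statistics
import timeit


def enumerate_tuples(n_items=12):
    """Hypothetical stand-in workload: count all non-empty combinations."""
    items = range(n_items)
    return sum(
        1
        for k in range(1, n_items + 1)
        for _ in itertools.combinations(items, k)
    )


def summarize(timings):
    """Report the minimum, median, maximum, and spread of repeated timings."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
        "spread": max(timings) - min(timings),
    }


# Run the workload once per repetition, five repetitions in total
timings = timeit.repeat(enumerate_tuples, number=1, repeat=5)
summary = summarize(timings)
for name, seconds in summary.items():
    print(f"{name}: {seconds:.4f} s")
```

      A large spread relative to the median would suggest interference from other processes; a small spread with a stable gap between the two Stata versions would suggest the difference is systematic.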



      • #4
        Originally posted by Daniel Schaefer:
        Regarding #1: Is it possible there are other background processes (or on a server, other users) that might be consuming system resources and interfering with the second test? If so, then if you repeat the test there should be variation in the total time each procedure takes.
        No. There are two sessions of Stata running, 16 and 17. That is it. Obviously, timings change from run to run but the puzzling result that python script appears to be slower on Stata 17 than on Stata 16 (by quite a margin) remains stable.

        Originally posted by Daniel Schaefer:
        I would assume you only have one python 3 environment installed? If so, then I believe the python bytecode implementation should be the same regardless of the version of Stata, correct?
        Yes, and I can confirm this via

        Code:
        python
        import sys
        print(sys.version)
        in both instances of Stata. I wonder, however, why the version reported by python query differs across the versions of Stata (3.10.0 vs. 3.10.4).
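        A slightly fuller comparison of the two interpreters could look like the sketch below, assuming it is run from a python block in each Stata session and the output compared by eye. interpreter_fingerprint is a hypothetical helper name; the calls themselves are standard-library ones.

```python
import platform
import sys


def interpreter_fingerprint():
    """Collect details that identify which Python build is actually loaded."""
    return {
        "version": sys.version,
        "version_info": tuple(sys.version_info[:3]),
        "executable": sys.executable,
        "prefix": sys.prefix,
        "build": platform.python_build(),
        "compiler": platform.python_compiler(),
    }


for key, value in interpreter_fingerprint().items():
    print(f"{key}: {value}")
```

        If every field matches across the two sessions, the 3.10.0 vs. 3.10.4 discrepancy is more likely to come from how python query reports the version than from two different interpreters being loaded.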

        Originally posted by Daniel Schaefer:
        Regarding #2: I wonder if the problem size is large enough that the 12-core implementation is worth the overhead associated with starting a bunch of separate threads or possibly starting virtual threads? This also raises another question: Is -tuples- written in such a way that it is parallelizable?
        I agree that Hemanshu Kumar should probably increase the size of the item set; I would not read too much into the difference between 1.8 and 1.5 seconds on Stata 16. However, it is striking that python script takes more than 2 seconds on Stata 17. And, no, we have not set up tuples with parallelization in mind; I have no idea what the Mata (or Python) compiler is able to do in this regard.
        Last edited by daniel klein; 28 Sep 2022, 09:47.



        • #5
          As suggested, I changed the item set
          Code:
          numlist "1/20"
          and got perhaps more meaningful results. Unlike #1, now I get python being by far the superior method in both versions of Stata.

          In Stata 17/MP 12-core:
          Code:
          . timer list
             1:     13.98 /        1 =      13.9850
             2:     25.66 /        1 =      25.6560

          In Stata 16/MP 2-core:
          Code:
          . timer list
             1:     10.09 /        1 =      10.0900
             2:     24.00 /        1 =      23.9990
          Stata 16/MP 2-core still edges out Stata 17/MP 12-core, and by a significant margin!
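          To put the python script timings from #1, #2, and this post side by side, the ratios can be computed as in the sketch below. The numbers are the timer 1 averages copied from the posts; the dictionary layout and the slowdown helper are illustrative choices only.

```python
# Timer 1 (python script) averages in seconds, copied from the posts in this thread
python_timings = {
    "Windows, numlist 1/17 (#1)": {"stata16": 2.4660, "stata17": 6.5350},
    "Mac, numlist 1/17 (#2)": {"stata16": 1.8000, "stata17": 2.2250},
    "Mac, numlist 1/20 (#5)": {"stata16": 10.0900, "stata17": 13.9850},
}


def slowdown(stata16_seconds, stata17_seconds):
    """Return how many times longer Stata 17 took than Stata 16."""
    return stata17_seconds / stata16_seconds


ratios = {
    label: slowdown(t["stata16"], t["stata17"])
    for label, t in python_timings.items()
}

for label, ratio in ratios.items():
    print(f"{label}: Stata 17 took {ratio:.2f}x as long as Stata 16")
```

          These are single runs on different machines and Stata flavors, so the ratios are only a rough summary of the pattern, not a benchmark.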



          • #6
            Originally posted by Hemanshu Kumar:
            Unlike #1, now I get python being by far the superior method in both versions of Stata.
            This is what I would expect.

            Originally posted by Hemanshu Kumar:
            Stata 16/MP 2-core still edges out Stata 17/MP 12-core, and by a significant margin!
            And this is what bothers me.

            Thanks for replicating this.
