Is -python script- much slower in Stata 17 than in Stata 16?

daniel klein

Join Date: Mar 2014
Posts: 3859

28 Sep 2022, 06:15

I use the tuples command (from SSC). By default, tuples is implemented in terms of python script; optionally, it uses Mata code instead.
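For readers who do not know the command: the following is a minimal sketch of the kind of enumeration involved, assuming that tuples produces every non-empty combination of the items in a list. It is not the code that tuples actually runs; the function name `all_tuples` and the three-item list are made up for illustration, and the ordering need not match the command's.

```python
from itertools import combinations

def all_tuples(items):
    """Yield every non-empty combination of items, smallest combinations first."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

# Hypothetical three-item list, small enough to inspect by eye.
for number, combo in enumerate(all_tuples(["a", "b", "c"]), start=1):
    print(f"tuple{number}: {' '.join(combo)}")
```

With three items this prints seven lines, one per combination; the 17-item list used below is the same idea at a much larger scale.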

I run the following script:

Code:

cls
about
python query

clear all
macro drop _all

numlist "1/17"

timer clear

// timing python script
timer on 1
tuples `r(numlist)'
timer off 1

macro drop _tuple*

// timing mata code
timer on 2
tuples `r(numlist)' , nopython
timer off 2

timer list

Here are the results from Stata 16:

Code:

Stata/IC 16.1 for Windows (64-bit x86-64)
Revision 14 Jun 2022
Copyright 1985-2019 StataCorp LLC

Total physical memory:       16.00 GB
Available physical memory:   10.67 GB

Stata license: Single-user  perpetual
Serial number: omitted
  Licensed to: omitted

. python query
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    omitted

    Python system information
      initialized          yes
      version              3.10.0
      architecture         64-bit
      library path         omitted\Python\Python310\python310.dll

omitted

. timer list
   1:      2.47 /        1 =       2.4660
   2:      5.35 /        1 =       5.3530

Python does the job in about 2.5 seconds, which is roughly half the time Mata takes.

Here are the results from the identical script in Stata 17:

Code:

. about

Stata/BE 17.0 for Windows (64-bit x86-64)
Revision 23 Aug 2022
Copyright 1985-2021 StataCorp LLC

Total physical memory:       16.00 GB
Available physical memory:   10.65 GB

Stata license: Single-user  perpetual
Serial number: omitted
  Licensed to: omitted

. python query
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    omitted
    
    Python system information
      initialized          yes
      version              3.10.4
      architecture         64-bit
      library path         omitted\Python\Python310\python310.dll

omitted

. timer list
   1:      6.54 /        1 =       6.5350
   2:      4.90 /        1 =       4.8980

Good news: Mata is faster now. Bad news: Python is much slower.

What is going on?
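To put numbers on the change, here is a small sketch that computes the Stata 17 to Stata 16 ratio for each method from the timer values reported above. The dictionary layout and the function name `slowdown` are just illustrative choices.

```python
# Timings in seconds, copied from the two `timer list` outputs above.
timings = {
    "python": {"stata16": 2.466, "stata17": 6.535},
    "mata": {"stata16": 5.353, "stata17": 4.898},
}

def slowdown(method):
    """Ratio of the Stata 17 time to the Stata 16 time (above 1 means slower)."""
    return timings[method]["stata17"] / timings[method]["stata16"]

for method in timings:
    print(f"{method}: Stata 17 / Stata 16 = {slowdown(method):.2f}")
```

On these single runs, the Python route takes roughly 2.65 times as long in Stata 17, while the Mata route takes a little over 8 percent less time.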

Last edited by daniel klein; 28 Sep 2022, 06:25.

Tags: mata, python

Hemanshu Kumar

Join Date: Mar 2015
Posts: 1409

28 Sep 2022, 06:23

daniel klein, I get somewhat different results on my Mac.

This is Stata 17:

Code:

. timer list
   1:      2.23 /        1 =       2.2250
   2:      1.63 /        1 =       1.6280

. about

Stata/MP 17.0 for Mac (Intel 64-bit)
Revision 23 Aug 2022
Copyright 1985-2021 StataCorp LLC

Total physical memory: 16.00 GB

Stata license: Single-user 12-core <omitted>

As compared to Stata 16:

Code:

. timer list
   1:      1.80 /        1 =       1.8000
   2:      1.51 /        1 =       1.5080

.
end of do-file

. about

Stata/MP 16.1 for Mac (Intel 64-bit)
Revision 14 Jun 2022
Copyright 1985-2019 StataCorp LLC

Total physical memory: 16.00 GB

Stata license: Single-user 2-core  <omitted>

Also, the python setup:

Code:

. python query 
-------------------------------------------------------------------------------------------------------------------------------
   <omitted>
    Python system information
      initialized          yes
      version              3.10.5
      architecture         64-bit

So for me, Mata is faster in both. The bad news: Stata 16/MP 2-core out-performs Stata 17/MP 12-core!!

Last edited by Hemanshu Kumar; 28 Sep 2022, 06:31.


Daniel Schaefer

Join Date: Mar 2020

Posts: 814
#3

28 Sep 2022, 09:06

Regarding #1: This is very strange. Is it possible there are other background processes (or on a server, other users) that might be consuming system resources and interfering with the second test? If so, then if you repeat the test there should be variation in the total time each procedure takes. If you repeat the test and consistently get the same results, then I have no idea what's happening. I would assume you only have one python 3 environment installed? If so, then I believe the python bytecode implementation should be the same regardless of the version of Stata, correct?

Regarding #2: I wonder if the problem size is large enough that the 12-core implementation is worth the overhead associated with starting a bunch of separate threads or possibly starting virtual threads? This also raises another question: Is -tuples- written in such a way that it is parallelizable?
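The "repeat the test" idea from the first paragraph can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `workload` is a made-up stand-in that counts combinations, not the tuples command, and `time_repeated` is a hypothetical helper. The point is to look at the spread across runs rather than at a single timing.

```python
import statistics
import time
from itertools import combinations

def workload(n_items=12):
    """Stand-in task: count every non-empty combination of n_items items."""
    items = range(n_items)
    return sum(1 for size in range(1, n_items + 1)
               for _ in combinations(items, size))

def time_repeated(func, repeats=5):
    """Run func several times and return the elapsed seconds of each run."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed.append(time.perf_counter() - start)
    return elapsed

runs = time_repeated(workload)
print(f"min={min(runs):.4f}s  median={statistics.median(runs):.4f}s  "
      f"max={max(runs):.4f}s  stdev={statistics.stdev(runs):.4f}s")
```

If the spread across runs is small relative to the gap between the two Stata versions, background load becomes a less likely explanation.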
daniel klein

Join Date: Mar 2014

Posts: 3859
#4

28 Sep 2022, 09:44

Originally posted by Daniel Schaefer

Regarding #1: Is it possible there are other background processes (or on a server, other users) that might be consuming system resources and interfering with the second test? If so, then if you repeat the test there should be variation in the total time each procedure takes.

No. There are two sessions of Stata running, 16 and 17. That is it. Obviously, timings change from run to run but the puzzling result that python script appears to be slower on Stata 17 than on Stata 16 (by quite a margin) remains stable.

Originally posted by Daniel Schaefer

I would assume you only have one python 3 environment installed? If so, then I believe the python bytecode implementation should be the same regardless of the version of Stata, correct?

Yes, and I can confirm this via

Code:

python:
import sys
print(sys.version)
end

in both instances of Stata. I wonder, however, why the version reported by python query differs across the versions of Stata (3.10.0 vs. 3.10.4).
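A slightly more detailed check might look like the sketch below, which could be pasted between `python:` and `end` in each Stata session and compared side by side. It uses only the standard `sys` and `platform` modules; the function name `interpreter_report` is made up for illustration, and whether the output explains the 3.10.0 vs. 3.10.4 discrepancy is an open question.

```python
import platform
import sys

def interpreter_report():
    """Collect details that identify the interpreter running this code."""
    return {
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
        # When Python is embedded in another program, this value may not
        # point at a python executable at all.
        "executable": sys.executable,
        "prefix": sys.prefix,  # installation root of the interpreter
        "architecture": platform.architecture()[0],
    }

for key, value in interpreter_report().items():
    print(f"{key:15s} {value}")
```

If the two sessions report different prefixes or versions here, they are not loading the same Python installation after all.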

Originally posted by Daniel Schaefer

Regarding #2: I wonder if the problem size is large enough that the 12-core implementation is worth the overhead associated with starting a bunch of separate threads or possibly starting virtual threads? This also raises another question: Is -tuples- written in such a way that it is parallelizable?

I agree that Hemanshu Kumar should probably increase the size of the item set; I would not read too much into the difference between 1.8 and 1.5 seconds on Stata 16. However, it is striking that python script takes more than 2 seconds on Stata 17. And, no, we have not set up tuples with parallelization in mind; I have no idea what the Mata (or Python) compiler is able to do in this regard.
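As a rough guide to how quickly the problem grows with the size of the item set, here is a small sketch. It assumes that every non-empty subset of the list counts as one tuple, so n items give 2^n - 1 tuples; the function name `n_tuples` is illustrative, and 20 is just an example of a larger list.

```python
def n_tuples(n_items):
    """Number of non-empty subsets of n_items distinct items: 2**n - 1."""
    return 2 ** n_items - 1

for n in (17, 20):
    print(f"{n} items -> {n_tuples(n):,} tuples")
```

Under that assumption, 17 items give 131,071 tuples and 20 items give 1,048,575, so three extra items multiply the work by about eight.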

Last edited by daniel klein; 28 Sep 2022, 09:47.
Hemanshu Kumar

Join Date: Mar 2015

Posts: 1409
#5

28 Sep 2022, 10:01

As suggested, I changed the item set

Code:

numlist "1/20"

and got perhaps more meaningful results. Unlike #1, now I get python being by far the superior method in both versions of Stata.

In Stata 17/MP 12-core:

Code:

. timer list
   1:     13.98 /        1 =      13.9850
   2:     25.66 /        1 =      25.6560

In Stata 16/MP 2-core:

Code:

. timer list
   1:     10.09 /        1 =      10.0900
   2:     24.00 /        1 =      23.9990

Stata 16/MP 2-core still edges out Stata 17/MP 12-core, and by a significant margin!
daniel klein

Join Date: Mar 2014

Posts: 3859
#6

28 Sep 2022, 10:06

Originally posted by Hemanshu Kumar

Unlike #1, now I get python being by far the superior method in both versions of Stata.

This is what I would expect.

Originally posted by Hemanshu Kumar

Stata 16/MP 2-core still edges out Stata 17/MP 12-core, and by a significant margin!

And this is what bothers me.

Thanks for replicating this.