  • General question about Stata performance

    Dear Stata users,

    I am working on a local server and running a grid search. My do-file is very basic: based on some parameter values, I generate simulated data and find the closest match to observed data by some criteria. A bunch of gen, sort, replace, keep, and define-local commands wrapped in "preserve" and "restore" for each of, say, 12,000 iterations.

    Once I start my code, the following happens: the first few hundred iterations are lightning fast, taking less than a second each. Then, as time goes by, they get slower and slower. By the 1600th iteration, each one takes 15 s or so!
    If I start my code directly from the 1600th iteration, it is processed in the usual quick fashion, in less than a second. So it is not that the iterations themselves become more computationally intensive; it must be due to Stata's way of processing things.

    One of my hopes was that maybe other operations I run on the company server were "stealing" CPU, and that this was the cause of my troubles. I tried using AWS and set up a VM solely for the purpose of running this programme and nothing else. Same story, same pattern over time.

    How can one escape from this? Is it possible at all? Is the programme storing some useless information that I can get rid of to improve performance?

    Any opinion is appreciated.


    Last edited by Nikolay Antonov; 10 Jul 2015, 06:35. Reason: grid search

  • #2
    It would be easier for us if you pasted your code, or a sample of it. Since Stata keeps data in memory, maybe your code forces Stata to keep too much in memory.
    Regards
    --------------------------------------------------
    Attaullah Shah, PhD.
    Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
    FinTechProfessor.com
    https://asdocx.com
    Check out my asdoc program, which sends outputs to MS Word.
    For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

    Comment


    • #3
      * Note: the locals `group' and `seq' are defined earlier in the do-file.
      forvalues b_CL = -0.4(0.1)0.4 {
          forvalues ib = -5(1)5 {
              local b_DEN = `ib'/100
              forvalues ic = -5(1)5 {
                  local b_theta = `ic'/1000
                  forvalues id = 9700(20)9900 {
                      local a_theta = `id'/10000

                      * predicted theta for this parameter combination, capped at 0.995
                      gen theta_e = `a_theta' + `b_theta'*dm
                      replace theta_e = 0.995 if theta_e > 0.995
                      gen diff_theta = abs(theta - theta_e)

                      local benchmark_`group' = 50

                      timer on 1
                      forvalues a_CL = 5(1)15 {
                          forvalues ia = 3/9 {
                              local a_DEN = `ia'/10
                              qui {
                                  preserve

                                  * predicted CL and DEN, floored at their minima
                                  gen CL_e = `a_CL' + `b_CL'*dm
                                  replace CL_e = 5 if CL_e < 5
                                  gen DEN_e = `a_DEN' + `b_DEN'*dm
                                  replace DEN_e = 0.1 if DEN_e < 0.1

                                  * keep observations whose predicted CL rounds to the observed CL
                                  keep if round(CL_e*100, 100) == CL*100
                                  gen diff_DEN = abs(DEN - DEN_e)

                                  * keep the closest match per group and cohort
                                  sort group cohort diff_theta diff_DEN
                                  bys group cohort: keep if _n == 1

                                  count
                                  if `r(N)' == 13 {
                                      * running sum puts the weighted SSE total
                                      * in the last observation
                                      gen weight_SSE = sum(w*SSE_total)
                                      local result = weight_SSE[_N]
                                      if `result' < `benchmark_`group'' {
                                          local SSE_`group'_`seq' = weight_SSE[_N]
                                          local benchmark_`group' = `result'
                                      }
                                  }
                                  else {
                                      pause Check
                                  }

                                  restore
                                  local ++i
                              }
                          }
                      }

                      capture drop theta_e
                      capture drop diff_theta
                      capture drop diff_DEN

                      timer off 1
                      timer list

                      local ++seq
                      local ++col
                  }
              }
          }
      }

      I've pasted it here and also attached it as a txt file.
      Attached Files

      Comment


      • #4
        Update:

        1) Tried to track memory usage by displaying c(memory) at the end of each iteration: it does not change at all.

        2) Tried to remove preserve and restore by saving a temporary dataset, loading it at the beginning and erasing it at the end of each iteration (see the sketch below): time increases at a slower pace, but the general trend is still there.

        3) Dropped all iteration-specific macros at the end of each iteration: same as 2).

        What other temporary stuff could be there that I am missing that slows down execution? Or can one only optimize so much?
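
        For reference, the version in 2) looked roughly like this (names and loop bounds illustrative):
        Code:
        tempfile base
        save `base'                // snapshot the initial dataset once

        forvalues iter = 1/12000 {
            use `base', clear      // reload in place of preserve/restore
            * ... gen / keep / sort work for this iteration ...
        }
        Stata deletes the tempfile automatically when the do-file ends, so no explicit erase is needed.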

        Comment


        • #5
          Consider the following, which may reduce the number of observations going through the innermost parts of the process. By doing the sort outside the 77 passes through the two innermost loops, the keep added within them will not require a sort.
          Code:
          ...
          gen diff_theta = abs(theta - theta_e)

          sort group cohort diff_theta

          local benchmark_`group' = 50

          timer on 1
          forvalues a_CL = 5(1)15 {
              forvalues ia = 3/9 {
                  local a_DEN = `ia'/10
                  qui {
                      preserve
                      bys group cohort (diff_theta): keep if diff_theta == diff_theta[1]
          ...
          I know this doesn't address the question of why the process slows with advancing iterations, but perhaps it will help make it less noticeable.

          Also, when you say you are running on a local server, do you mean a system shared by multiple users running multiple tasks? If so, then it is possible that the server's dispatching routine is reducing the priority, and share of resources, allocated to your task as the total time increases, to give shorter tasks better turnaround.
          Last edited by William Lisowski; 10 Jul 2015, 14:58.

          Comment


          • #6
            Thanks for the suggestion! I will implement it straightaway.

            Regarding the local server, you are correct: it is used by multiple users. Hence the idea to go with the Amazon cloud. I set up my own "server" there solely for the purpose of running this programme, thereby isolating all the potential effects you get with many users. Unfortunately, without any notable success.

            Comment


            • #7
              What are your variables? Can you share a subsample or some artificial data for executing together with the code?
              Get rid of preserve/restore completely. Why do you need them at all?
              Use tempvar for theta_e, diff_theta, and diff_DEN (see the sketch below).
              Break your program down into subroutines.
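
              A minimal sketch of the tempvar idea (illustrative; adapt to your loops):
              Code:
              * tempvar names are unique each pass and vanish when the do-file ends;
              * inside a long loop you may still want to drop them at the end of each pass
              tempvar theta_e diff_theta
              gen double `theta_e' = `a_theta' + `b_theta'*dm
              replace `theta_e' = 0.995 if `theta_e' > 0.995
              gen double `diff_theta' = abs(theta - `theta_e')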

              Comment


              • #8
                Nikolay - As presented, your code apparently does not reset the timer to zero after reporting it. From help timer we see
                timer on begins a timing. timer off stops a timing. A timing may be turned on and off repeatedly without clearing, which causes the timer to accumulate.
                My apologies for not noticing this earlier. Blame it on unfamiliarity with the details of the timer command.

                With that said, there is much about your code that confuses me, and it feels as if you have simplified part of it for presentation here, perhaps omitting a timer clear command that would have reset the timer to zero after reporting it (or before turning it on). Or perhaps the times you reported were the differences between successive values. If the times reported were indeed for individual passes through the inner loops, and not accumulated times for all completed passes, please report back and I'll ask some follow-up questions.
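
                For illustration, a reset each pass would look something like this (placement hypothetical, since we see only part of the do-file):
                Code:
                timer on 1
                * ... inner loops ...
                timer off 1
                timer list 1
                timer clear 1    // reset so the next report covers only the next pass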
                Last edited by William Lisowski; 11 Jul 2015, 00:46.

                Comment


                • #9
                  Guys, thank you very much for your suggestions!

                  Sergiy Radyakin : I'll try to upload some sample data. To give you a vague idea, the dataset is small, around 1 MB. I used preserve/restore because I drop observations, so I need my initial dataset back at the beginning of each iteration. I removed preserve and restore by loading the dataset at each iteration; the outcome does not change much.

                  William Lisowski : True, I do not reset the timer to zero. On the other hand, timer used my way reports the average time one iteration takes, and that either stays the same or grows constantly. It is not influenced by, say, an outlier that takes 5 minutes and inflates the average (which would then decrease, other things equal). So this should mean that each iteration takes more and more time to complete. But I'll certainly take your advice and reset the timer each time.

                  I tried playing around with the memory settings. It turns out newer versions of Stata (I am using Stata 13 MP) do not accept the command "set memory #", as they manage memory automatically. But one can change segmentsize, which in turn changes the minimum value of max_memory (= 2*segmentsize). Strangely enough, lowering the segment size shaved about 0.5 s off my iterations, but this might be luck, since I cannot come up with a rational story for it.
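
                  For reference, what I tried looked roughly like this (values illustrative):
                  Code:
                  * segmentsize is best changed with no data in memory
                  clear all
                  set segmentsize 16m    // lowers the floor on max_memory (= 2*segmentsize)
                  query memory           // reports the current memory settings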


                  Comment


                  • #10
                    Nikolay Antonov: I see, timer list reports the total, the number of cycles, and the average per cycle, and it is the average that you find is growing.

                    Have you confirmed that the number of observations kept by the condition on CL and CL_e is not increasing as the iterations advance? Not because of the iteration count itself, but because of the different combinations of values of b_CL, b_DEN, b_theta, and a_theta in the outermost loops that surround the timed portion of the program. You might try adding the following three lines just after the keep statement mentioned above
                    Code:
                    quietly describe
                    local nobs = r(N)
                    local nvar = r(k)
                    and then add the following just before or after the timer list command
                    Code:
                    display "nobs= `nobs'  nvar= `nvar'"
                    That should confirm that nothing is causing the amount of work performed in the innermost loop, notably by the sorting, to grow unexpectedly.

                    Also, it's possible that the describe command might be quicker than the count command for getting the total number of observations in the data.

                    If the growing workload should be the problem, perhaps the sort in the innermost loop can be replaced by creative use of egen to find the minimum value of diff_theta (by group and cohort), and then of diff_DEN (by group and cohort, where diff_theta is the minimum for that group and cohort); you are possibly moving a lot of data around just to locate a single observation. An untested sketch:
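                    Code:
                    * untested sketch: locate the minima without a full four-key sort
                    bysort group cohort: egen min_theta = min(diff_theta)
                    bysort group cohort: egen min_DEN = ///
                        min(cond(diff_theta == min_theta, diff_DEN, .))
                    keep if diff_theta == min_theta & diff_DEN == min_DEN
                    If ties occur, you would still need a rule to keep exactly one observation per group and cohort.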

                    In any event, I continue to be curious about the growth in run time, and I understand and appreciate your interest in resolving it. I've always found that if I put aside a problem I cannot understand, hacking my way around it somehow, it will without fail come back to plague me in some other setting at the worst possible time.

                    Comment


                    • #11
                      Hi William,

                      I do believe that all the iterations are identical, for the following reason: if I set the loop parameters so as to start directly from what is otherwise the 2000th iteration (which normally takes much longer), it takes the same short time as at the beginning of the programme. So I don't believe that what is happening is due to the data becoming larger or taking longer to process across iterations.

                      I am attaching samples of my data and the complete do-file that I am using. If you want, you can have a look; I guess that would be easier.
                      Attached Files

                      Comment


                      • #12
                        For some reason I cannot upload the other Excel file; it says "Error uploading image". It is 8 MB, and I have no idea what the limit here is. Splitting it into several 3 MB files does not work either...

                        Comment


                        • #13
                          Maybe this would do:

                          https://www.dropbox.com/s/n7647vnw7i...1_sim.dta?dl=0

                          Comment


                          • #14
                            Nikolay -

                             I successfully downloaded your files, including the Stata data file on Dropbox containing your group1 data. I've again reviewed the code in light of the data, and one thing continues to escape my understanding. Your formulae include the terms dm, w, and SSE_total, yet they are not variables in the group1 data. I note that outside the program you must, at a minimum, set the values of the local macros group and seq. Do you create the three variables at that time as well? I don't think they are scalars; the formulae don't make sense to me under that interpretation, but then maybe that's my fault as well.

                            Apologies for my apparent denseness on this.

                            Bill

                            Comment


                            • #15
                              Hi William,

                               Yes, I do create variables such as dm_cohort, w, and SSE_total before the nested loops. If you check the do-file "Grid" that I also uploaded, you will see where I do this, in rows 132-150.

                              Comment
