I am trying to optimize the speed of the user-written -synth_runner- command using 8-core StataMP 15.1 on a Mac with 8 cores and 16 GB of physical memory.
I ran a simulation where I varied the number of clusters from 1 to 8 and also performed a non-parallelized version of the analysis (all code at bottom).
Here are the results, where the timer # corresponds to the number of clusters. Timer 10 is the non-clustered version.
I am struggling to understand why the run time barely decreases beyond 3 clusters. I get similar results when I use the nested optimization option (though obviously all the times are longer).
Code:
. timer list
   1:     43.59 /        1 =      43.5920
   2:     25.23 /        1 =      25.2340
   3:     20.99 /        1 =      20.9890
   4:     20.11 /        1 =      20.1050
   5:     19.37 /        1 =      19.3670
   6:     19.36 /        1 =      19.3550
   7:     20.06 /        1 =      20.0600
   8:     19.27 /        1 =      19.2720
  10:     77.37 /        1 =      77.3670
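To make the plateau easier to see, the timings can be converted into speedups relative to the one-cluster run. The snippet below is only a sketch: the variable names (clusters, seconds, speedup, efficiency) are my own, the numbers are typed in by hand from the timer list above, and it clears whatever data is in memory. By this measure the speedup is only about 2.1 at three clusters and about 2.3 at eight.

```stata
* Sketch only: hand-entered timings from the timer list above.
* Variable names are illustrative, not part of synth_runner or parallel.
clear
input byte clusters double seconds
1 43.592
2 25.234
3 20.989
4 20.105
5 19.367
6 19.355
7 20.060
8 19.272
end

* Speedup relative to the one-cluster run (first observation)
generate double speedup = seconds[1] / seconds

* Parallel efficiency: 1 would mean perfect scaling
generate double efficiency = speedup / clusters

format seconds %9.3f
format speedup efficiency %6.2f
list, noobs clean
```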
Here's the code:
Code:
set more off
capture trace off
clear all
cls
cap drop pre_rmspe post_rmspe lead effect cigsale_synth
cap drop cigsale_scaled effect_scaled cigsale_scaled_synth D
cap program drop my_pred my_drop_units my_xperiod my_mspeperiod
program my_pred, rclass
    args tyear
    return local predictors "beer(`=`tyear'-4'(1)`=`tyear'-1') lnincome(`=`tyear'-4'(1)`=`tyear'-1')"
end
program my_drop_units
    args tunit
    if `tunit'==39 qui drop if inlist(state,21,38)
    if `tunit'==3 qui drop if state==21
end
program my_xperiod, rclass
    args tyear
    return local xperiod "`=`tyear'-12'(1)`=`tyear'-1'"
end
program my_mspeperiod, rclass
    args tyear
    return local mspeperiod "`=`tyear'-12'(1)`=`tyear'-1'"
end
timer clear
timer on 10
use smoking, clear
tsset state year
gen byte D = (state==3 & year>=1989) | (state==7 & year>=1988)
synth_runner cigsale retprice age15to24, d(D) pred_prog(my_pred) trends training_propr(`=13/18') ///
    drop_units_prog(my_drop_units) xperiod_prog(my_xperiod) mspeperiod_prog(my_mspeperiod) deterministicoutput // nested
effect_graphs
pval_graphs
timer off 10
forvalues p = 1(1)8 {
    timer on `p'
    parallel clean, all
    parallel setclusters `p'
    use smoking, clear
    tsset state year
    gen byte D = (state==3 & year>=1989) | (state==7 & year>=1988)
    synth_runner cigsale retprice age15to24, d(D) pred_prog(my_pred) trends training_propr(`=13/18') ///
        drop_units_prog(my_drop_units) xperiod_prog(my_xperiod) mspeperiod_prog(my_mspeperiod) parallel deterministicoutput // nested
    effect_graphs
    pval_graphs
    timer off `p'
}
timer list
