Hi,
I'm trying to split a master dataset into one file per country. The datasets are large (about 300 GB) and the job currently takes more than 48 hours to run, so any help speeding it up would be much appreciated.
Using this data:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str20 bvd_id_number strL main_activity str2 countrycode
"CN9360430024" "Manufacturing" "CN"
"AU072891993"  "Services"      "AU"
"US149668182L" "Manufacturing" "US"
"US133096011L" "Services"      "US"
"CA32531NC"    "Services"      "CA"
end
and this Stata code:
Code:
//create country list
glevelsof countrycode, local(countries)
//timer on
timer on 1
parallel: foreach c of local countries {
    use overviews.dta, clear
    keep if countrycode == "`c'"
    save `c', replace
}
timer off 1
timer list 1
Code:
//timer on
.
. timer on 1
.
. parallel: foreach c of local countries {
--------------------------------------------------------------------------------
Parallel Computing with Stata (by GVY)
Clusters : 4
pll_id : rp2wznupm1
Running at : D:\Firmographics\overviews\parallell_test
Randtype : datetime
Waiting for the clusters to finish...
-3621
cluster 0004 has exited without error...
-3621
cluster 0001 has exited without error...
-3621
cluster 0002 has exited without error...
-3621
cluster 0003 has exited without error...
--------------------------------------------------------------------------------
Enter -parallel printlog #- to checkout logfiles.
--------------------------------------------------------------------------------
unlink(): 3621 attempt to write read-only file
parallel_recursively_rm(): - function returned error
parallel_clean(): - function returned error
<istmt>: - function returned error
r(3621);
end of do-file
Ciaran
