parallel usage

Dave Airey

Join Date: Apr 2014

Posts: 406
#1

parallel usage

04 Aug 2025, 08:32

I sometimes run many regressions on each of 1000s of genes or proteins where subjects from an experiment are measured for each protein. I sometimes use Stata, sometimes R. In this case I'm using Stata because I want an interval censored Cox model. Stata offers a better implementation of this model in my opinion than R solutions. My question has to do with whether I can use the parallel package. Below is example code, that uses third party commands xsvmat and xframeappend to grab r(table) from each model and append into one results frame. xsvmat and xframeappend are from Roger Newson. To speed this up, I broke the 2755 genes into groups of 500 or so and ran those subsets in 5 separate Stata instances because I have a Windows laptop with 32gb of RAM and 14 cores. I manually appended the results into a single results file and ran model diagnostics on top biomarkers. While this approach worked just fine, I was curious about the use of parallel.

Has anyone had a similar problem dataset (long formatted dataset with many endpoints) and used parallel on a Windows machine to collect model results for each endpoint?

Code:

frame reset use mydata.dta, clear frame create cox_results local setA = "c.x1 i.x2" quietly { forvalues i = 1/2755 { noisily display `i' stintcox baseline_ntx `setA' if gene_id == `i', /// interval(ltime rtime) favorspeed xsvmat, from(r(table)') rowname(parm) names(col) idnum(`i') /// frame(outframe, replace) frame change cox_results xframeappend outframe, fast frame change default } } frame change cox_results save cox_setA_cov_results.dta, replace /* The mydata.dta variables are: gene_id subject ltime rtime baseline_ntx x1 x2 2755 gene_id, ~800 subjects measured for each protein (gene_id) A cox model is run for each gene_id using a for loop */

Last edited by Dave Airey; 04 Aug 2025, 08:49.
Tags: None
George Vega

Join Date: May 2014

Posts: 14
#2

04 Aug 2025, 09:28

Hi Dave,

You can certainly do something like we do in this example. You need to write a program that holds all the logic, iterating through gene ids after listing them. It would be something like this (my Stat is a bit rusty!):

Code:

prog def parfor /* some code to list available genes */ /* and store it in genelist */ foreach i in $genelist { ... your fancy code ... ... A step to save the i-th gene results... } end parallel initialize 10 parallel, prog(parfor): parfor /* Collect all the results */
1 like
Comment
Dave Airey

Join Date: Apr 2014

Posts: 406
#3

04 Aug 2025, 09:37

Thank you George Vega! So it is possible. I will give it a try. Cheers!
1 like
Comment

Dave Airey

Join Date: Apr 2014
Posts: 406

04 Aug 2025, 13:18

It worked great George Vega! Just trying 500 genes in a stintcox model with no covariates and using 8 cores, the parallel code ran in 53 seconds whereas the sequential code ran in 184 seconds, 3.5x difference. Pretty nice.

Code:

clear all
use mydata.dta, clear
keep if gene_id <= 500

program define parfor
levelsof gene, local(mygenes)
global mygenes = r(levels)
foreach i of global mygenes {
    stintcox baseline_ntx if gene == "`i'", /// 
        interval(ltime rtime) favorspeed
    xsvmat, from(r(table)') rowname(parm) names(col) idstr("`i'") /// 
        saving(.\parallel\gene_`i', replace)
}
end

timer clear
timer on 1
parallel initialize 8
parallel, prog(parfor): parfor
timer off 1
timer list

use mydata.dta, clear
keep if gene_id <= 500
levelsof gene, local(mygenes)
global mygenes = r(levels)
timer on 2
quietly {
foreach i of global mygenes {
    stintcox baseline_ntx if gene == "`i'", /// 
        interval(ltime rtime) favorspeed
    xsvmat, from(r(table)') rowname(parm) names(col) idstr("`i'") /// 
        saving(.\parallel\gene_`i', replace)
}
}
timer off 2

timer list

. timer list
1: 53.05 / 1 = 53.0550
2: 184.05 / 1 = 184.0460

Last edited by Dave Airey; 04 Aug 2025, 13:21.

Comment

Dave Airey

Join Date: Apr 2014

Posts: 406
#5

04 Aug 2025, 13:26

Hmmm.... this should be in StataNow! A nice complement to Stata MP, which I have and like. Oh, and I did check results against each other and they were the same.

Last edited by Dave Airey; 04 Aug 2025, 13:51.
Comment
Dave Airey

Join Date: Apr 2014

Posts: 406
#6

Yesterday, 10:06

Actually when I checked more carefully by a merge operation with assert commands, I discovered that 4/500 of the genes' data had been split across cores. There are 800+ observations on the same subjects per gene. This results in models being estimated in part of the 800 observations per gene. So the parfor program needs more information to split the data across cores but by gene.
Comment
Felix Bittmann

Join Date: Aug 2018

Posts: 731
#7

Yesterday, 11:12

I think you can solve the issue by just reloading the original dataset within your program. By doing so, each gene is tested on the original, unsplit dataset.

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Comment

Dave Airey

Join Date: Apr 2014
Posts: 406

Yesterday, 11:36

Thanks Felix Bittmann , I solved this just now using a different method, but one illustrated in George Vega's gallery. I used one of the parallel macros "$pll_instance". I created a variable pid assigning an integer from 1 to the number of cores I wanted to each gene randomly. I then used that in the program to subset the data to a core. Now all my genes get processed and match the sequential results using a series of assert commands (not shown).

Code:

clear all
use mydata.dta, clear
keep if gene_id <= 500

bysort gene: generate pid = runiformint(1,8) if _n == 1
bysort gene: replace pid = pid[1]
save mytempdata.dta, replace

program define parfor
use if pid == $pll_instance using mytempdata.dta, clear
levelsof gene, local(mygenes)
global mygenes = r(levels)
foreach i of global mygenes {
    stintcox baseline_ntx if gene == "`i'", ///
        interval(ltime rtime) favorspeed
    xsvmat, from(r(table)') rowname(parm) names(col) idstr("`i'") ///
        saving(.\parallel\gene_`i', replace)
}
end

timer clear
timer on 1
parallel initialize 8
parallel, prog(parfor): parfor
timer off 1
timer list

use mytempdata.dta, clear
levelsof gene, local(mygenes)
global mygenes = r(levels)
timer on 2
quietly {
foreach i of global mygenes {
    stintcox baseline_ntx if gene == "`i'", ///
        interval(ltime rtime) favorspeed
    xsvmat, from(r(table)') rowname(parm) names(col) idstr("`i'") ///
        saving(.\parallel\gene_`i', replace)
}
}
timer off 2

timer list

Last edited by Dave Airey; Yesterday, 11:40.

Comment

Dave Airey

Join Date: Apr 2014

Posts: 406
#9

Yesterday, 12:54

Actually, George pointed out offline there is a by option that keeps the by variable from being split up across child processes. Slapped forehead. That will work too.
Comment

Announcement

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment