bootstrapping takes forever in cmrologit

Max Hartz

Join Date: Dec 2023

Posts: 6
#1


07 Dec 2023, 01:31

Dear Statalist,

We have the following data setup:
Each survey respondent makes a multiple-choice selection among 18 alternatives (3 votes in total, up to 2 votes per alternative).
We then regress choice on characteristics of the alternatives, using a cmrologit model.

Notably, the choice table is experimentally manipulated across four treatment groups, which changes how the choice table is displayed.

We have 112,824 data rows: one choice set of 18 alternatives from each of 6,268 survey respondents.

We first run individual cmrologit models for every treatment subgroup.

Code:

cmset clustervar, noalternatives
forval i = 1/4 {
    eststo m`i': cmrologit depvar i.indepvar1 i.indepvar2 i.indepvar3 c.indepvar4 c.indepvar5 ///
        if treatmentindicator == `i', ///
        incomplete(0) ties(exactm) vce(bootstrap, cluster(clustervar) reps(500))
}

This takes about 2-3 days to run per individual model, but then gives us seemingly adequate output.

To test whether the coefficients of our independent variables differ significantly between treatment groups, we then pool these four models into one, interacting the characteristics of the alternatives with the four-category treatment indicator.

Code:

cmset clustervar, noalternatives
eststo: cmrologit depvar (i.indepvar1 i.indepvar2 i.indepvar3 c.indepvar4 c.indepvar5)##i.treatmentindicator, ///
    incomplete(0) ties(exactm) vce(bootstrap, cluster(clustervar) reps(500))

This model starts to run, but after more than a week only 7 bootstrap replications have completed.

What could be going on here? Why is this taking so much longer?

Any thoughts are much appreciated.

Best
Felix Bittmann

Join Date: Aug 2018

Posts: 728
#2

07 Dec 2023, 03:23

I cannot say anything about the formulas used, but my simple tests show that the interaction model runs about 17x longer than the sum of all individual models using the if qualifier. I suppose that the larger total sample size leads to a nonlinear increase in the number of comparisons/computations needed. Maybe you can simply use the CIs from the separate if-qualified models to gauge whether the results differ statistically. Or use parallel (https://github.com/gvegayon/parallel) to speed things up.
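A rough sketch of how such a comparison between the separate models could look, going one step beyond eyeballing the CIs: a z-test for the difference in a single coefficient between two subgroup models. It assumes that the results are stored as m1 and m2 (as in the loop in #1), that the treatment groups contain different respondents so the two estimates are independent, and that indepvar4 is the coefficient of interest; it does not adjust for multiple comparisons.

```stata
* Sketch only: compare the coefficient on indepvar4 between groups 1 and 2.
* Assumes m1 and m2 were stored by eststo and come from independent samples.
estimates restore m1
scalar b1  = _b[indepvar4]
scalar se1 = _se[indepvar4]

estimates restore m2
scalar b2  = _b[indepvar4]
scalar se2 = _se[indepvar4]

* z statistic for the difference of two independent estimates
scalar z = (b1 - b2) / sqrt(se1^2 + se2^2)
scalar p = 2 * normal(-abs(z))
display "z = " %6.3f z "   two-sided p = " %6.4f p
```

For factor variables the coefficient would be addressed by level, for example _b[2.indepvar1].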

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Max Hartz

Join Date: Dec 2023

Posts: 6
#3

07 Dec 2023, 06:59

Thanks, the factor of 17 in your tests is already an interesting indication that at least we might not have messed up after all when setting this up...
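One way to check such a factor on our own data might be to time a single fit of each specification without bootstrapping, as sketched below. The timer numbers 1 and 2 are arbitrary choices, and the variable names are the placeholders from #1. Given that 7 replications took over a week, a single pooled fit may itself run for about a day, so trying this on a subsample of respondents first is probably more practical.

```stata
* Sketch only: time one fit per specification, without vce(bootstrap).
cmset clustervar, noalternatives

timer clear
timer on 1
cmrologit depvar i.indepvar1 i.indepvar2 i.indepvar3 c.indepvar4 c.indepvar5 ///
    if treatmentindicator == 1, incomplete(0) ties(exactm)
timer off 1

timer on 2
cmrologit depvar (i.indepvar1 i.indepvar2 i.indepvar3 c.indepvar4 c.indepvar5)##i.treatmentindicator, ///
    incomplete(0) ties(exactm)
timer off 2

* timer list reports elapsed seconds and leaves them in r(t1) and r(t2)
timer list
display "pooled / single-group time ratio: " r(t2) / r(t1)
```

Multiplying each single-fit time by the number of replications gives a rough estimate of the full bootstrap run time.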