Speeding up random parameter mixed-logit estimation (cmxtmixlogit) with large dataset

Paul Lohmann

Join Date: Jun 2018

Posts: 5
#1

20 Jan 2021, 06:18

Dear Stata Users,

I am wondering if anyone has a strategy to speed up the estimation of the cmxtmixlogit command?

I am estimating a set of choice models using a large panel dataset with approximately 60,000 cases and 200,000 observations. My preferred specification includes both fixed and random parameters, as well as a set of case-specific covariates.

Although I have also considered user-written commands such as mixlogit, I am using cmxtmixlogit because of its post-estimation capabilities with margins and the ease of specifying case-specific variables. I am aware that cmxtmixlogit is not parallelized, so using a multi-core computer with Stata MP (e.g. with 8 cores) or even a computing cluster would not improve the estimation speed. Because of the large sample size, it is also not feasible to set a small number of intpoints, as the models then fail to converge. The simulation procedure takes especially long when I include additional case-specific covariates such as week fixed effects (24 weeks x 5 alternatives), which amounts to a large number of parameters. So far, I have not been able to estimate a model using all data and all covariates, even after letting the command run for up to a week.

I am hoping that there is something that I can tweak to be able to use the cmxtmixlogit command with my data. Any suggestions would be greatly appreciated.

Thanks,
Paul
Andrew Musau

Join Date: Oct 2014

Posts: 10482
#2

20 Jan 2021, 08:19

If you can estimate the same model with a faster command, do so and then feed the estimates to cmxtmixlogit as starting values. When specifying the starting values, make sure that the equation names and column names match across the two commands.

Code:

mixlogit ....
mat b = e(b)
cmxtmixlogit ..., from(b, skip)
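
A slightly fuller sketch of this workflow follows. All variable names (choice, price, time, id, t, alt) are hypothetical and not taken from the thread. It assumes the user-written mixlogit is installed and that cmset has created the case identifier _caseid for the panel. The two commands label the columns of e(b) differently, so the sketch lists both sets of names before anything is passed to from(). With the skip suboption, any entry of b whose name does not match a parameter in the cmxtmixlogit model is ignored, so unmatched names mean that no starting values are carried over.

```stata
* Hypothetical variable names throughout; adapt to your own data.
cmset id t alt                      // panel id, time, alternatives

* Faster preliminary fit with the user-written mixlogit
* (assumes _caseid, created by cmset, identifies each choice situation)
mixlogit choice price, group(_caseid) id(id) rand(time) nrep(100)
matrix b = e(b)
matrix list b                       // inspect equation and column names

* See how cmxtmixlogit labels the same parameters (no iterations)
cmxtmixlogit choice price, random(time) iterate(0)
matrix list e(b)

* After renaming the columns of b to match
* (matrix coleq / matrix colnames), pass b as starting values.
* Parameters not supplied in b keep their default starting values.
cmxtmixlogit choice price, random(time) from(b, skip)
```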
Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#3

20 Jan 2021, 08:41

With that many data points, I don't think it is possible to reduce the computer run time dramatically. If it's any comfort, you'll quickly get used to going for a week or two without seeing a single set of estimates. Just another day in the life of choice modellers. I haven't used -cmxtmixlogit- yet as I don't have an active project in multinomial choice modelling, but here are a few (uninformed) suggestions that I'd like to share.

(1) By default, -cmxtmixlogit- simulates choice probabilities using Hammersley sequences. Perhaps you can combine -intmethod(halton)- or -intmethod(halton, antithetics)- with a smaller number in -intpoints()- to see whether alternative draw types help convergence.

(2) I don't know whether -cmxtmixlogit- uses an analytic or a numerical Hessian when it applies its default -technique(nr)-. Assuming that it relies on a numerical Hessian, you may speed up the estimation run by using -technique(bhhh)- or -technique(bfgs)-, neither of which relies on the numerical Hessian.

(3) You can use Matthew Baker's -bayesmlogit- (you have to -ssc install- it first) to apply an MCMC approach to estimate your baseline model specification that only includes random coefficients. Then you can pass the results as starting values to -cmxtmixlogit- to obtain the MSL estimates and take advantage of the official command's postestimation options. You can then use the MSL estimates of your baseline specification as starting values for your full model specification that includes both random and fixed coefficients. The Bayesian or MCMC approach may estimate the mixed logit model considerably faster than MSL + gradient-based optimisation techniques, provided that every coefficient in the model is a random coefficient. So the idea here is to use the faster approach to get close to a maximum in your simulated likelihood function, and then use the conventional approach to actually reach that maximum.
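
Suggestions (1) and (2) map onto options roughly as in the following sketch. The variable names (choice, price, time, income) and the number of integration points are placeholders rather than recommendations, and whether either variant converges faster on a given dataset has to be checked empirically.

```stata
* Hypothetical variables: choice, price, time, income; data already cmset.

* (1) Halton draws with antithetics and a reduced number of integration points
cmxtmixlogit choice price, random(time) casevars(income) ///
    intmethod(halton, antithetics) intpoints(200)

* (2) Same model with BHHH instead of the default Newton-Raphson
cmxtmixlogit choice price, random(time) casevars(income) ///
    intmethod(halton, antithetics) intpoints(200) technique(bhhh)
```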
Paul Lohmann

Join Date: Jun 2018

Posts: 5
#4

31 Jan 2021, 15:16

Thank you both for your suggestions! I have been trying them over the past couple of weeks and just wanted to follow up with some feedback:

(1) It seems that -intmethod(halton, antithetics)- is helping convergence, at least in some slightly less complex models that I have been running with fewer draws (although I have no clear counterfactual).

(2) Both alternative techniques (bhhh) & (bfgs) are substantially faster, but I have not managed to get the models to converge. They usually end up getting stuck - (backed up) - after a few iterations and this has also been the case with smaller samples and less complex models.

(3) Obtaining starting values from faster commands (-bayesmlogit- or -mixlogit-) works, but once I use -cmxtmixlogit- and add too many covariates as casevars, it is still incredibly slow. So the solution really is to avoid specifying too many covariates. For example, as mentioned in my original post, 23 week dummies x (5-1) alternatives, i.e. 92 extra coefficients, is just too many, and I have been trying to work around that.

Anyway, using -intmethod(halton, antithetics)- and with a lot of patience, I have been getting some results! Thank you!
Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#5

01 Feb 2021, 02:15

That's great to hear! Yes, a lot of patience is what we need when working with this type of application. I forgot to mention another kludge you may try in the future: something like -technique(bfgs 15 nr 5)-, which asks Stata to use bfgs for 15 iterations and then nr for 5 iterations; it sometimes helps the numerical solver to skip past flat regions of the log-likelihood function. There's nothing special about the numbers 15 and 5; you can experiment with different combinations until the dough feels right. The -(backed up)- messages are OK as long as the log-likelihood value actually does improve over iterations (no matter how marginally) and they don't pop up during the last few iterations before Stata declares convergence.
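
In a full command, the mixed technique might look like the sketch below. The variable names and the number of integration points are hypothetical; the technique() specification is the one quoted above. Stata keeps cycling through the listed techniques (15 bfgs iterations, 5 nr iterations, then bfgs again) until convergence or the iteration limit.

```stata
* Hypothetical variables; data already cmset.
cmxtmixlogit choice price, random(time) casevars(income) ///
    intmethod(halton, antithetics) intpoints(200)         ///
    technique(bfgs 15 nr 5)

* 1 if the optimizer declared convergence, 0 otherwise
display e(converged)
```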