
  • Adding random intercepts in CMP slows down estimation a lot and makes required RAM exceed 500GB

    I'm running a 4-equation model (2 × Heckman selection models) with CMP, and I want to add random intercepts (maybe later also random slopes). Without any random effects, the model runs fast (4 hours for an N = 700,000 dataset with ~10 predictors per equation). I'm running the model on a Linux server, and it consumes at most 2GB of RAM.

    When I add a random intercept (the data belong to 17,000 individuals) to each of the four equations, Stata consumes more than 500GB of RAM after a few minutes (which is the maximum I can allocate to the job), and the estimation aborts.

    I've also tried adding random intercepts to only two of the four equations; then the model consumes at most 22GB of RAM but still has not finished after 3.5 days.

    I'm wondering why adding the random intercepts has such an enormous effect on the computational effort of the model.

    Any ideas what's going on?

  • #2
    Hi Jochen. A narrow answer to your question is that estimating random-effects models requires numerically approximating many integrals over the possible values of the random effects, for each of the 17,000 groups. The complexity explodes when you have multiple, potentially correlated random effects, because the integrals in your case are then over a four-dimensional space. You might compare the Methods and formulas sections in the manual entries for probit and xtprobit and notice the added complexity in the latter, even for a one-dimensional problem.
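    To put rough numbers on that, here is a back-of-envelope sketch (not cmp's exact algorithm; the point counts and any adaptive behavior are assumptions for illustration). With product-rule quadrature, the integrand is evaluated at m points per dimension, so a d-dimensional integral costs m^d evaluations per group per likelihood evaluation:

    ```python
    # Back-of-envelope: product Gauss-Hermite quadrature cost scales as
    # (points per dimension) ** (number of random-effect dimensions),
    # evaluated once per group for every likelihood evaluation.

    def integrand_evaluations(groups, points_per_dim, dims):
        """Integrand evaluations needed for one likelihood evaluation."""
        return groups * points_per_dim ** dims

    # One random intercept (1-D integral per group, xtprobit-style),
    # assuming 12 quadrature points per dimension:
    one_dim = integrand_evaluations(17_000, 12, 1)

    # Four correlated random intercepts (4-D integral per group):
    four_dim = integrand_evaluations(17_000, 12, 4)

    print(one_dim)             # 204000
    print(four_dim)            # 352512000
    print(four_dim // one_dim) # 1728  (= 12**3)
    ```

    So going from one random intercept to four multiplies the per-iteration work by m^3 (here 1,728×), and the quadrature grids and intermediate results have to be held in memory for each group, which is consistent with the RAM blow-up you are seeing.
    
    
    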

    One practical suggestion is to use the intpoints() option to reduce the precision (and hence the cost) of the numerical integration. You could try intpoints(6 6 6 6) for the 4-D problem, or even smaller values.
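    Because the cost is exponential in the number of dimensions, trimming points per dimension pays off quickly. A quick arithmetic check (the specific point counts are illustrative; cmp's defaults vary by model):

    ```python
    # Quadrature grid size per group for a 4-dimensional random-effects
    # integral at various intpoints() settings. Halving the points per
    # dimension shrinks the grid by 2**4 = 16x.
    for m in (12, 8, 6, 4):
        print(m, m ** 4)
    # 12 20736
    # 8 4096
    # 6 1296
    # 4 256
    ```

    The trade-off, of course, is accuracy: fewer integration points means a rougher approximation to the likelihood, so it is worth checking that estimates are stable as you increase the point count again.
    
    
    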
