posterior means for crossed random effects from logistic random intercept model

Stefan Sacchi

Join Date: Dec 2019

Posts: 4
#1

posterior means for crossed random effects from logistic random intercept model

24 Dec 2019, 06:09

Dear Statalist members,

I am trying to estimate the posterior means of two crossed random effects from a logistic random intercept model (estimated with melogit). Although the model is quite large (some 24000 observations, around 25 individual-level covariates and two crossed random effects), estimation converges without problems after some time (around 50 hours). However, when I try to calculate posterior means of both random effects (with predict re_*, reffects), this takes a seemingly endless amount of time (currently 15 days).

To check how fast this type of postestimation works with a much smaller problem, I have created an artificial data set with just 200 observations, two individual-level covariates and two crossed random effects for two grouping variables with 12 and 18 categories. Model estimation runs smoothly and the results conform to what I would expect, given the structure of the artificial data (all syntax and output below). However, the calculation of the posterior means again takes a lot of time and has still not finished (after approximately 3 days). Although the user is warned that "computing empirical Bayes means for a crossed-effects model is very time consuming", the calculation seems extremely slow given the apparently limited size of the estimation problem.

I should probably mention that I have no problems estimating posterior modes (with predict …, reffects ebmodes).
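For reference, the posterior-mode call can be sketched as follows. This is a minimal sketch that assumes the melogit model from the do-file below has just been estimated; the stubs mode_re* and se_re* are hypothetical names chosen for illustration, and reses() is included only to show that standard errors can be requested at the same time.

```stata
* Assumes the crossed-effects melogit model is the active estimation result.
* mode_re* and se_re* are illustrative stubs, not names from the original post.
predict mode_re*, reffects ebmodes reses(se_re*)

* Quick look at the predicted modes and their standard errors
summarize mode_re* se_re*
```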

I would very much appreciate any comments and ideas on the following points:
1. Is there some fundamental issue with the calculation of posterior means for crossed random effects that I should be aware of?
2. Has anybody successfully used "predict …, reffects" after a logistic mixed model with crossed random effects? What's the experience regarding the amount of time needed?
3. Are there any alternative (and preferably faster) estimation methods for posterior means in this type of situation?

Thanks a lot for any suggestions!
Stefan

My do-file:

Code:

clear
set obs 200
gen x1 = rnormal(0,1)
gen x2 = rnormal(0,1)
gen u1 = rnormal(0,1)
gen u2 = rnormal(0,1)
correlate x* u*
xtile re1 = u1, nq(12)
xtile re2 = u2, nq(18)
by re1, sort: egen postmean1 = mean(u1)
by re2, sort: egen postmean2 = mean(u2)
gen e = rnormal(0,1.5)
gen z = (0.7*x1)+(-0.5*x2)+postmean1+postmean2+e
gen prob = 1/(1+exp(-1*z))
sum prob
recode prob (min/0.6=1)(*=0), gen(d)
melogit d x1 x2 || _all:R.re1 || re2:, diff
* started ca 15:30 pm 21. 12. 2019
predict pre*, reffects

Model-Output:
Joseph Coveney

Join Date: Apr 2014

Posts: 4420
#2

24 Dec 2019, 18:48

I'm guessing that you're going to have to approximate the integrals using stochastic methods.
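To make "the integrals" concrete, here is a generic sketch in standard empirical Bayes notation (my own notation, not taken from the Stata documentation). It assumes that u collects all random intercepts of both grouping variables, y is the response vector, and theta-hat holds the estimated coefficients and variance components. The posterior mean is a ratio of integrals, and a stochastic method replaces it with an average over S draws from the posterior.

```latex
\[
\mathrm{E}(\mathbf{u} \mid \mathbf{y}; \hat\theta)
= \frac{\int \mathbf{u}\, f(\mathbf{y} \mid \mathbf{u}; \hat\theta)\,
        \phi(\mathbf{u}; \hat\theta)\, d\mathbf{u}}
       {\int f(\mathbf{y} \mid \mathbf{u}; \hat\theta)\,
        \phi(\mathbf{u}; \hat\theta)\, d\mathbf{u}}
\;\approx\; \frac{1}{S} \sum_{s=1}^{S} \mathbf{u}^{(s)},
\qquad
\mathbf{u}^{(s)} \sim p(\mathbf{u} \mid \mathbf{y}; \hat\theta)
\]
```

With crossed effects the likelihood does not factor by group, so the integral runs over all intercepts jointly (12 + 18 = 30 in the artificial example), which is presumably why grid-based integration gets so expensive.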

The bayes: prefix works with melogit, so that with flat* priors would be an easy first thing to look into. I don't know whether you'll be able to get estimates of the posterior means for individuals using that.

If not, then you could look into explicitly modeling individual effects in order to force their estimation. You can do that using bayesmh and calling your user-written likelihood.

* Although there's probably no real need to insist on flat priors, and convergence might be aided with judicious choices of weakly informative ("regularizing") priors.
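The bayes: suggestion above might look roughly like this. It is a minimal sketch that assumes the variables from the do-file in #1, leaves the priors at their defaults rather than setting flat ones, and uses an arbitrary seed. To my knowledge the showreffects option adds the random-effects parameters to the output table, whose Mean column would then hold their posterior means; please verify this against the documentation for your Stata version.

```stata
* Sketch only: Bayesian fit of the crossed-effects logistic model by MCMC.
* The seed value is arbitrary; the priors are the defaults, not flat priors.
* The maximize option "diff" from the original call is dropped because
* no likelihood maximization takes place here.
bayes, showreffects rseed(12345): melogit d x1 x2 || _all:R.re1 || re2:
```

Check the MCMC diagnostics (for example with bayesgraph diagnostics) before relying on the summaries.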
Stefan Sacchi

Join Date: Dec 2019

Posts: 4
#3

13 Jan 2020, 05:32

Thanks a lot for the suggestions (and sorry for the slow response).

Unfortunately, I am not familiar with "bayes" (and I guess it would take some time to become familiar with it).
Actually, I tend to use posterior modes (instead of posterior means), which are calculated very quickly even if the crossed random intercept model is huge (with the predict option "ebmodes").
I would suspect that posterior modes and means should be more or less the same, as long as the underlying distributions are unimodal and symmetric.
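One way to probe this suspicion is to compare means and modes in a setting where both are cheap to compute. The sketch below assumes the artificial data from #1 and, purely for illustration, a model with re1 as the only grouping variable; the stubs eb_mean* and eb_mode* are hypothetical names. Agreement in this simpler model does not guarantee agreement in the crossed model, so treat it as a rough check only.

```stata
* Illustration only: single grouping variable, so both predictions are fast.
melogit d x1 x2 || re1:

predict eb_mean*, reffects            // empirical Bayes means (the default)
predict eb_mode*, reffects ebmodes    // empirical Bayes modes

* How closely do the two sets of predictions agree?
correlate eb_mean1 eb_mode1
summarize eb_mean1 eb_mode1
```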
Stefan Sacchi

Join Date: Dec 2019

Posts: 4
#4

13 Jan 2020, 08:31

Perhaps I should add that the rather small simulation described above is still running (i.e. for more than 3 weeks now, on a notebook with 8 GB RAM and a 2.30 GHz processor).
Given the apparently small size of the problem, it seems to me that Stata's warning ("computing empirical Bayes means for a crossed-effects model is very time consuming") is a bit of an understatement.