  • multilevel logistic model via -bayesmh-

    I am trying to implement a Bayesian version of melogit. [ME] uses towerlondon.dta as a running example.

    Code:
    use http://www.stata-press.com/data/r14/towerlondon.dta, clear
    melogit dtlm i.group || subject:
    fvset base none subject
    bayesmh dtlm i.group i.subject, likelihood(logit) noconstant ///
        prior({dtlm:i.subject}, normal({dtlm:_cons},{var_subj})) ///
        prior({dtlm:_cons}, normal(0, 100))                      ///
        prior({dtlm:i.group}, normal(0, 100))                    ///
        prior({var_subj}, igamma(0.001, 0.001))                  ///
        mcmcsize(5000)                                           ///
        block({var_subj}, gibbs) block({dtlm:i.subj})            ///
        block({dtlm:i.group}) block({dtlm:_cons}, gibbs) dots
    The results are problematic: convergence is not achieved, the dreaded "autocorrelation at 500 lags" warning is shown, and the results are nowhere near those from melogit. Note that I used blocking in the priors, but I could not apply as much Gibbs sampling as saves Example 21 in [BAYES] bayesmh: my attempts to add , gibbs to the blocks for the outcome variable failed with an error message:

    Code:
    Gibbs sampling not supported;
        Option gibbs not allowed in block(dtlm:1bn.subj ...) for the specified prior
        distribution of the parameters.
    Is there anything I can do on my side to increase the efficiency of the sampler? I don't think this is an issue of the number of parameters in the block (there are 200-something unique subjects), as it fails even with the two levels of i.group.
    -- Stas Kolenikov || http://stas.kolenikov.name
    -- Principal Survey Scientist, Abt SRBI
    -- Opinions stated in this post are mine only


  • #2
    I've always gotten that error message when trying to use Gibbs sampling on factor variables. As for improving the estimates, you can try splitting the updates (i.e., sampling one-at-a-time) for at least the subjects (they're conditionally independent, anyway) and see whether that helps. It will also help to tighten up the prior on the coefficients: you can probably go with an SD of 5 or so on the normal prior for those (I've never seen a logistic regression coefficient much larger than that that wasn't the result of separation or quasiseparation). I've tried these tweaks below, and the regression coefficients for i.group (see the manual calculations at the bottom) are fairly close to what melogit gets, considering that (i) the random-effects variance collapsed, (ii) the adaptation never actually finished (when I model, I use adaptation(every(500)) burnin(`=<X>*500'), where X is at least 25 and usually more), and (iii) the chain is short.
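
    For instance, with X = 25 that inline macro expression simply expands at parse time to a burn-in of 12,500 iterations:
    Code:
    . display `=25*500'
    12500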

    Also, you might need to nudge the prior for the variance. I've read that the inverse-gamma has problems of its own, and here the challenge is greater: the likelihood-ratio test shown by melogit suggests that the random effect for subject (variance 0.28 ± 0.26) is fairly weak. If the variance for the subjects collapses to zero, as it seems to have done in my output below, then you might need to use a noncentered parameterization for the subject effect, which I believe will require using the nonlinear regression form of bayesmh. So far, I've only used Stata's Bayesian commands in pharmacometrics-type work (population pharmacokinetics and the like), so I've never written it for a generalized linear model such as logistic regression, but in the models where I've used it to avoid collapse of the random-effects variance to zero, the noncentered parameterization for subjects has been something in the nature of
    Code:
    . . . + {dtlm:i.subject} * sqrt({var_subj}) + {mu_subj} . . .
    in the likelihood equation, and then
    Code:
    prior({dtlm:i.subject}, normal(0, 1)) prior({var_subj}, jeffreys) prior({mu_subj}, flat)
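
    For concreteness, here is a rough, untested sketch of what a full noncentered specification might look like in bayesmh's nonlinear (substitutable-expression) form, patterned on the redefine() syntax; the names {grp:}, {u:}, and {mu_subj} are illustrative only, and the blocking and burn-in options would still need tuning:
    Code:
    bayesmh dtlm = ({grp:} + {mu_subj} + sqrt({var_subj})*{u:}), likelihood(logit) ///
        redefine(grp:i.group) redefine(u:i.subject) ///
        prior({u:i.subject}, normal(0, 1))          ///  standardized subject effects
        prior({grp:i.group}, normal(0, 25))         ///  SD 5; normal() takes a variance
        prior({mu_subj}, flat)                      ///
        prior({var_subj}, jeffreys)                 ///
        exclude({u:i.subject}) dots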
    Code:
    . version 14.0

    . clear *

    . set more off

    . use http://www.stata-press.com/data/r14/towerlondon.dta, clear
    (Tower of London data)

    . melogit dtlm i.group || subject: , nolog

    Mixed-effects logistic regression               Number of obs     =        677
    Group variable:         subject                 Number of groups  =        226

                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        3.0
                                                                  max =          3

    Integration method: mvaghermite                 Integration pts.  =          7

                                                    Wald chi2(2)      =       7.84
    Log likelihood = -368.33244                     Prob > chi2       =     0.0198
    ------------------------------------------------------------------------------
            dtlm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
                 |
           group |
              2  |  -.1130515    .229383    -0.49   0.622    -.5626339    .3365309
              3  |  -.7306802   .2765577    -2.64   0.008    -1.272723   -.1886371
                 |
           _cons |  -.9860716   .1840842    -5.36   0.000     -1.34687   -.6252733
    -------------+----------------------------------------------------------------
    subject      |
       var(_cons)|    .277968   .2577448                      .0451566    1.711072
    ------------------------------------------------------------------------------
    LR test vs. logistic model: chibar2(01) = 1.49        Prob >= chibar2 = 0.1111

    . fvset base none subject group

    . bayesmh dtlm i.group i.subject, likelihood(logit) noconstant ///
    >         prior({dtlm:i.subject}, normal({dtlm:_cons}, {var_subj})) ///
    >         prior({dtlm:_cons}, normal(0, 5)) ///
    >         prior({dtlm:i.group}, normal(0, 5)) ///
    >         prior({var_subj}, igamma(0.001, 0.001)) ///
    >         burnin(`=75*100') mcmcsize(5000) ///
    >         block({var_subj}, gibbs) ///
    >         block({dtlm:i.subj}, split) ///
    >         block({dtlm:i.group}, split) ///
    >         block({dtlm:_cons}, gibbs) dots noshow({dtlm:i.subject})

    Burn-in 7500 aaaaaaaaa1000aaaaaaaaa2000aaaaaaaaa3000aaaaaaaaa4000aaaaaaaaa5000aaaaaaaaa6000aaaaaaaaa7000aaaaa done
    Simulation 5000 .........1000.........2000.........3000.........4000.........5000 done

    Model summary
    ------------------------------------------------------------------------------
    Likelihood:
      dtlm ~ logit(xb_dtlm)

    Priors:
          {dtlm:i.subject} ~ normal({dtlm:_cons},{var_subj})                   (1)
      {dtlm:i.group _cons} ~ normal(0,5)                                       (1)

    Hyperprior:
      {var_subj} ~ igamma(0.001,0.001)
    ------------------------------------------------------------------------------
    (1) Parameters are elements of the linear form xb_dtlm.

    Bayesian logistic regression                     MCMC iterations  =     12,500
    Metropolis-Hastings and Gibbs sampling           Burn-in          =      7,500
                                                     MCMC sample size =      5,000
                                                     Number of obs    =        677
                                                     Acceptance rate  =      .4376
                                                     Efficiency:  min =    .001535
                                                                  avg =      .0115
    Log marginal likelihood =  138.38328                          max =     .01989

    ------------------------------------------------------------------------------
                 |                                                Equal-tailed
                 |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
    -------------+----------------------------------------------------------------
    dtlm         |
           group |
              1  |  .7438835   .1763245   .018988   .7481888   .3887397   1.070838
              2  |  .6404729   .1486484   .016952   .6435405   .3380207   .9313152
              3  |  .0358945   .2066024   .020718   .0449909  -.4042884   .4026351
                 |
           _cons | -1.686469   .0766233   .018466  -1.702915  -1.815309  -1.512156
    -------------+----------------------------------------------------------------
        var_subj |  .0284327   .0468534   .016911   .0067718   .0008775   .1629114
    ------------------------------------------------------------------------------
    Note: There is a high autocorrelation after 500 lags.

    . bayesstats ess

    Efficiency summaries    MCMC sample size =     5,000

    ----------------------------------------------------
                 |        ESS   Corr. time    Efficiency
    -------------+--------------------------------------
    dtlm         |
           group |
              1  |      86.23        57.98        0.0172
              2  |      76.89        65.03        0.0154
              3  |      99.45        50.28        0.0199
                 |
           _cons |      17.22       290.39        0.0034
    -------------+--------------------------------------
        var_subj |       7.68       651.37        0.0015
    ----------------------------------------------------

    . exit

    end of do-file


    . * melogit group 1 _cons |  -.9860716

    . display in smcl as text  -1.686469 + 0.7438835
    -.9425855

    . * melogit group 2

    . display in smcl as text -0.9860716 - 0.1130515
    -1.0991231

    . display in smcl as text  -1.686469 + 0.6404729
    -1.0459961

    . * melogit group 3

    . display in smcl as text -0.9860716 - 0.7306802
    -1.7167518

    . display in smcl as text  -1.686469 + 0.0358945
    -1.6505745

    .



    • #3
      Sorry, I forgot that it's the variance in bayesmh's normal priors, so the SDs would have been about 2¼ rather than 5, which is a little strong and probably explains why the estimates tended to be slightly low in comparison with melogit. It probably didn't help much in moving the variance component off the peg, either. It did help with efficiency, though, if that's any consolation.
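
      To make the arithmetic explicit: bayesmh's normal() prior takes a variance, not a standard deviation, so normal(0, 5) implies an SD of sqrt(5), about 2.24, and an SD of 5 would have required normal(0, 25):
      Code:
      . display sqrt(5)
      2.236068

      . display 5^2
      25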

      I still recommend looking into the noncentered parameterization, though.



      • #4
        Thanks, Joseph, that's very helpful. It still does not solve all the issues, unfortunately... and the ultimate model I am interested in is quite a bit more complicated.
        -- Stas Kolenikov || http://stas.kolenikov.name
        -- Principal Survey Scientist, Abt SRBI
        -- Opinions stated in this post are mine only



        • #5

          It is true that random-walk Metropolis is not a very efficient algorithm for simulating many conditionally independent parameters, such as the subject effects in this model. The Gibbs sampler is not applicable to the parameters defined in the logit likelihood, and although it can be specified for the subject variance {var_subj}, it would be of little help.
          For these data, we have a large number of subjects with only two or three observations per subject, so it would be difficult to achieve stable estimates of the individual subject effects; our goal should instead be to stabilize {var_subj}.

          First, I prefer an alternative model that keeps the constant term in the likelihood specification and applies a normal(0,{var_subj}) prior to the subject effects.

          As Joseph suggested, splitting the {dtlm:i.subject} parameters indeed helps MCMC convergence but, unfortunately, drastically increases the simulation time. For example, the following `noninformative-prior' model may run for about 45-50 minutes.
          Note that I increase the burn-in value to 10,000.

          Code:
          set seed 123
          bayesmh dtlm i.group i.subject, likelihood(logit)            ///
              prior({dtlm:i.subject}, normal(0,{var_subj}))            ///
              prior({dtlm:_cons}{dtlm:i.group},  normal(0, 100))       ///
              prior({var_subj}, igamma(0.01, 0.01))                    ///
              burnin(10000) mcmcsize(10000)                            ///
              block({dtlm:_cons}) block({var_subj})                    ///
              block({dtlm:i.subject},split) exclude({dtlm:i.subj})     ///
              block({dtlm:i.group},split) dots
          
          Bayesian logistic regression                     MCMC iterations  =     20,000
          Random-walk Metropolis-Hastings sampling         Burn-in          =     10,000
                                                           MCMC sample size =     10,000
                                                           Number of obs    =        677
                                                           Acceptance rate  =      .3817
                                                           Efficiency:  min =    .002231
                                                                        avg =     .02805
          Log marginal likelihood = -274.67548                          max =     .04681
           
          ------------------------------------------------------------------------------
                       |                                                Equal-tailed
                       |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
          -------------+----------------------------------------------------------------
          dtlm         |
                 group |
                    2  | -.1198234   .2322944   .012103  -.1181685  -.5785233   .3312183
                    3  | -.7466483   .2788367   .012888  -.7434621  -1.296931  -.2182482
                       |
                 _cons | -.9783426    .180101   .011099    -.97532  -1.327807   -.629271
          -------------+----------------------------------------------------------------
              var_subj |  .2734181   .1907652    .04039    .220105   .0320692   .7317239
          ------------------------------------------------------------------------------
          Note: There is a high autocorrelation after 500 lags.
          Although the `high-autocorrelation' message is still present, the MCSEs are relatively small and the posterior mean estimates are close to those obtained with melogit.
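
          To inspect this more closely after a run, bayesstats ess reports effective sample sizes for chosen parameters, and bayesgraph diag draws trace, autocorrelation, histogram, and density plots; for example:
          Code:
          bayesstats ess {dtlm:2.group} {dtlm:3.group} {var_subj}
          bayesgraph diag {var_subj}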

          Applying stronger priors may improve the convergence and sampling efficiency. For example, we may specify a normal(0,4) prior for the regression coefficients {dtlm:_cons} and {dtlm:i.group} and an igamma(40, 10) prior for {var_subj}.
          The mean of the igamma(40, 10) prior is about 0.25, deliberately close to the expected value. In this case we may stay with the default burn-in, burnin(2500), and somewhat reduce the simulation time.
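
          As a quick check, the mean of an inverse-gamma(a, b) distribution is b/(a - 1) for a > 1, so here it is 10/39, about 0.256, near melogit's variance estimate of 0.28:
          Code:
          . display 10/(40-1)
          .25641026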

          Code:
          set seed 123
          bayesmh dtlm i.group i.subject, likelihood(logit)        ///
              prior({dtlm:i.subject}, normal(0,{var_subj}))        ///
              prior({dtlm:_cons}{dtlm:i.group}, normal(0, 4))      ///
              prior({var_subj}, igamma(40, 10))                    ///
              burnin(2500) mcmcsize(10000)                         ///
              block({dtlm:i.group}{dtlm:_cons}{var_subj},split)    ///
              block({dtlm:i.subject},split) exclude({dtlm:i.subj}) ///
              init({var_subj} 1) dots
          
          Bayesian logistic regression                     MCMC iterations  =     12,500
          Random-walk Metropolis-Hastings sampling         Burn-in          =      2,500
                                                           MCMC sample size =     10,000
                                                           Number of obs    =        677
                                                           Acceptance rate  =      .4343
                                                           Efficiency:  min =     .04102
                                                                        avg =      .0462
          Log marginal likelihood = -469.36255                          max =     .05624
           
          ------------------------------------------------------------------------------
                       |                                                Equal-tailed
                       |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
          -------------+----------------------------------------------------------------
          dtlm         |
                 group |
                    2  | -.0916963   .2265778    .01054  -.0934276  -.5216888   .3756928
                    3  |  -.711456    .276167   .011645  -.7076111  -1.247206  -.1644392
                       |
                 _cons | -1.000955   .1760127   .008691  -.9996433  -1.364912    -.65553
          -------------+----------------------------------------------------------------
              var_subj |  .2572808   .0401843   .001977     .25241   .1898934   .3461623
          ------------------------------------------------------------------------------
          In this run we achieve an average sampling efficiency of about 5%, which is acceptable and enough to get rid of the `high-autocorrelation' message. The posterior mean estimates, however, are not as close to the melogit estimates as those obtained with the `noninformative-prior' model.

          We can also run the model in the original formulation with noconstant option in the likelihood and normal({dtlm:_cons},{var_subj}) prior.

          Code:
          set seed 123
          bayesmh dtlm i.group i.subject, likelihood(logit) nocons     ///
              prior({dtlm:i.subject}, normal({dtlm:_cons},{var_subj})) ///
              prior({dtlm:_cons}{dtlm:i.group}, normal(0, 4))          ///
              prior({var_subj}, igamma(40, 10))                        ///
              burnin(2500) mcmcsize(10000)                             ///
              block({dtlm:i.group}{dtlm:_cons}{var_subj},split)        ///
              block({dtlm:i.subject},split) exclude({dtlm:i.subj})     ///
              init({var_subj} 1) dots
          
          Bayesian logistic regression                     MCMC iterations  =     12,500
          Random-walk Metropolis-Hastings sampling         Burn-in          =      2,500
                                                           MCMC sample size =     10,000
                                                           Number of obs    =        677
                                                           Acceptance rate  =      .4352
                                                           Efficiency:  min =     .00474
                                                                        avg =     .01589
          Log marginal likelihood = -471.21572                          max =     .04043
           
          ------------------------------------------------------------------------------
                       |                                                Equal-tailed
                       |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
          -------------+----------------------------------------------------------------
          dtlm         |
                 group |
                    2  | -.0767307   .2121198    .02452  -.0708611  -.4881159   .3331182
                    3  | -.6989628   .2663774   .025494  -.6925549  -1.231847  -.1830926
                       |
                 _cons | -1.013824   .1581895   .022977  -1.012315  -1.340148  -.7014914
          -------------+----------------------------------------------------------------
              var_subj |  .2587118   .0405755   .002018   .2546826   .1910251   .3464461
          ------------------------------------------------------------------------------
          For this model the simulation has slightly lower, but still acceptable, efficiency.






          • #6
            Currently, the block option split cannot be applied to {dtlm:i.group}. In the above bayesmh specifications, please replace block({dtlm:i.group},split) with block({dtlm:2.group}{dtlm:3.group},split), and block({dtlm:i.group}{dtlm:_cons}{var_subj},split) with block({dtlm:2.group}{dtlm:3.group}{dtlm:_cons}{var_subj},split). Sorry about that.
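
            For concreteness, the first model above with that substitution made would read:
            Code:
            set seed 123
            bayesmh dtlm i.group i.subject, likelihood(logit)            ///
                prior({dtlm:i.subject}, normal(0,{var_subj}))            ///
                prior({dtlm:_cons}{dtlm:i.group}, normal(0, 100))        ///
                prior({var_subj}, igamma(0.01, 0.01))                    ///
                burnin(10000) mcmcsize(10000)                            ///
                block({dtlm:_cons}) block({var_subj})                    ///
                block({dtlm:i.subject},split) exclude({dtlm:i.subj})     ///
                block({dtlm:2.group}{dtlm:3.group},split) dots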



            • #7
              Thanks, Nikolay Balov (StataCorp).
              -- Stas Kolenikov || http://stas.kolenikov.name
              -- Principal Survey Scientist, Abt SRBI
              -- Opinions stated in this post are mine only



              • #8
                Please, I am trying to fit one-parameter, two-parameter, and three-parameter item response models using bayesmh, following code posted on the Stata blog. A sample of my data is
                Code:
                 
                Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 Q21 Q22 Q23 Q24 Q25 Q26 Q27 Q28 Q29 Q30 Q31 Q32 Q33 Q34 Q35
                0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0
                1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 0
                0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
                1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 0 0 1 0
                1 0 1 1 1 1 0 0 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 0 1 1
                0 1 1 1 1 1 1 0 0 0 1 0 1 1 1 1 0 1 0 1 0 1 1 1 0 1 0 1 1 1 0 0 0 1 0
                1 1 1 1 1 0 1 0 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 0 0 1 1 1 1 1 1 1 0 1 1
                1 0 1 0 1 1 0 0 0 1 1 1 1 0 0 1 0 0 0 0 0 1 1 1 1 1 0 1 0 0 1 1 0 1 0
                1 1 1 0 1 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1
                0 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 1 1 1
                1 0 1 1 1 0 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0 1 1
                0 0 1 0 1 1 1 0 1 0 1 1 1 1 1 1 0 1 0 1 1 1 1 0 0 1 1 1 1 1 0 1 0 1 1
                1 0 1 0 1 1 0 0 0 0 1 1 1 0 0 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1
                0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 0
                0 1 1 1 1 0 0 0 1 1 1 1 0 1 1 1 0 0 0 1 1 1 1 0 0 0 0 1 1 0 1 0 1 1 1
                0 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 1 1
                0 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1
                1 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 0 0 1 0 1 1 1 1 0 1 1 1
                0 0 1 1 1 1 0 0 1 0 1 1 1 1 1 1 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 0 0 1 1
                0 0 1 1 0 1 1 0 1 0 1 1 0 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1
                0 0 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 0 1 1
                1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1
                1 1 1 1 1 1 1 0 0 0 1 1 1 0 1 1 1 1 0 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1



                • #9
                  The code I am trying to use is
                  Code:
                  generate id = _n
                  quietly reshape long Q, i(id) j(item)
                  rename Q y
                  fvset base none id item
                  set seed 10
                  
                  bayesmh y = ({discrim}*({subj:}-{dif:})), likelihood(logit)
                                 redefine(dif:i.item) redefine(subj:i.id)   
                                 prior({subj:i.id},   normal(0,1))          
                                 prior({discrim},   lognormal(0,1))         
                                 prior({dif:i.item},   normal(0,1))          
                                 init({discrim} 1) exclude ({subj:i.id})     
                                 burnin (5000) saving (sim1pl, replace)



                  • #10
                    The error message I have is
                    Code:
                    bayesmh y = ({discrim}*({subj:}-{dif:})), likelihood(logit)
                    invalid bayesmh specification



                    • #11
                      Please, I need help.



                      • #12
                        Originally posted by Olayiwola Adetutu View Post
                        The code I am trying to use is
                        Code:
                        generate id = _n
                        quietly reshape long Q, i(id) j(item)
                        rename Q y
                        fvset base none id item
                        set seed 10
                        
                        bayesmh y = ({discrim}*({subj:}-{dif:})), likelihood(logit)
                        redefine(dif:i.item) redefine(subj:i.id)
                        prior({subj:i.id}, normal(0,1))
                        prior({discrim}, lognormal(0,1))
                        prior({dif:i.item}, normal(0,1))
                        init({discrim} 1) exclude ({subj:i.id})
                        burnin (5000) saving (sim1pl, replace)
                        Olayiwola,

                        I would suggest reading the forum FAQ. It advises against tacking an unrelated question onto another thread: Stas was asking about the Bayesian specification of a mixed logit model, whereas you are asking about the specification of IRT models. Your post would likely be missed by anyone searching for it.

                        Second, there's also advice there about not excessively bumping your posts to the top. You asked for help within a minute of your previous post.

                        Third, many of your prior posts have been unclearly phrased. That doesn't apply here, however.

                        I have a feeling your question might be easily answered. The code you posted doesn't have the line continuations (the '///'s in the Stata blog post). Without them, if you highlighted the block of code above and ran it, Stata would think all you asked for is

                        Code:
                        bayesmh y = ({discrim}*({subj:}-{dif:})), likelihood(logit)
                        without any further specification of the parameters and hyperparameters. Go back and put the line continuations in your code. You will need to highlight the whole block of code and run it for Stata to interpret it properly. Do note that you can't type /// at the interactive command line; it only works from a do-file.
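
                        For reference, this is your command from #9 with the continuations restored and the stray spaces before the parentheses removed; run it from a do-file:
                        Code:
                        bayesmh y = ({discrim}*({subj:}-{dif:})), likelihood(logit) ///
                            redefine(dif:i.item) redefine(subj:i.id)                ///
                            prior({subj:i.id}, normal(0,1))                         ///
                            prior({discrim}, lognormal(0,1))                        ///
                            prior({dif:i.item}, normal(0,1))                        ///
                            init({discrim} 1) exclude({subj:i.id})                  ///
                            burnin(5000) saving(sim1pl, replace)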

                        If this doesn't resolve your issue, please start a new post and show us the rest of the error messages.

                        Side note: I belatedly responded to a question you had (I'd been off the forum for a while). I hope that resolved your issue; given how you coded your data, it seems like it did.
                        Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.
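
                        For example (dataex is available from SSC if your Stata does not already include it; the variable list and observation range below are just illustrative):
                        Code:
                        ssc install dataex
                        dataex Q1-Q35 in 1/10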

                        When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
