Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • bayes, ... mixed: ... crashes for large burn-in periods

    Dear Statalisters,

    I would like to point out a problem that occurs when running various rather complex Bayesian multilevel models using the bayes prefix. The problem occurs in both Stata 17/MP and Stata 18/MP for Unix (Linux 64-bit x86-64). A test data set for such a case is attached. Here I try to run a crossed effects model with two ('fixed') covariates, two crossed effects and random coefficients ('slopes') in both levels (using Isabel’s suggestions presented here Including covariates in crossed-effects models).
    Here’s the code I’m using
    Code:
    *--------------------------------------------------*
    cap log close
    log using investigate_bayes_mixed, replace
    use test_data, clear
     
    *Including covariates in crossed-effects models as described by Isabel Canette
    *https://blog.stata.com/2010/12/22/including-covariates-in-crossed-effects-models/
     
    qui tab Group2, gen(id_Group2)
    unab idvar: id_Group2*
    foreach v of local idvar {
        gen inter`v' =covariate2*`v'
    }
     
    *--------------------------------------------------*
    *Standard frequentist model
    mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2
     
    *--------------------------------------------------*
     
    bayes, burnin(200000) dots(1000)  rseed(42) saving(investigate_bayes_mixed_simulation, replace): ///
        mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2
        estimates save investigate_bayes_mixed, replace
     
    *--------------------------------------------------*
    log close
    The code crashes Stata after estimating the model, i.e. Stata exits without saving the estimates, here is the log-file.
    Code:
    . use test_data, clear
     
    .
    . *Including covariates in crossed-effects models as described by Isabel Canette
    . *https://blog.stata.com/2010/12/22/including-covariates-in-crossed-effects-models/
    .
    . qui tab Group2, gen(id_Group2)
     
    . unab idvar: id_Group2*
     
    . foreach v of local idvar {
      2.     gen inter`v' =covariate2*`v'
      3. }
     
    .
    . *--------------------------------------------------*
    . *Standard frequentist model
    . mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2
     
    Performing EM optimization ...
     
    Performing gradient-based optimization:
    Iteration 0:  Log likelihood =  37.598276
    Iteration 1:  Log likelihood =  38.546812
    Iteration 2:  Log likelihood =  38.905076
    Iteration 3:  Log likelihood =  38.952799
    Iteration 4:  Log likelihood =  38.954077
    Iteration 5:  Log likelihood =  38.954091
    Iteration 6:  Log likelihood =  38.954091
     
    Computing standard errors ...
     
    Mixed-effects ML regression                             Number of obs =     97
     
            Grouping information
            -------------------------------------------------------------
                            |     No. of       Observations per group
             Group variable |     groups    Minimum    Average    Maximum
            ----------------+--------------------------------------------
                       _all |          1         97       97.0         97
                     Group1 |         24          1        4.0         20
            -------------------------------------------------------------
     
                                                            Wald chi2(2)  =  14.75
    Log likelihood =  38.954091                             Prob > chi2   = 0.0006
     
    ------------------------------------------------------------------------------
         outcome | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
      covariate1 |  -.0130288    .004734    -2.75   0.006    -.0223072   -.0037503
      covariate2 |  -.2171314    .066152    -3.28   0.001    -.3467869   -.0874759
           _cons |   .7379007   .0734132    10.05   0.000     .5940134     .881788
    ------------------------------------------------------------------------------
     
    ------------------------------------------------------------------------------
      Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    _all: Identity               |
                   var(R.Group2) |   .0233182   .0101985      .0098949    .0549513
    -----------------------------+------------------------------------------------
    _all: Identity               |
      var(inter~21..inte~219)(1) |   2.66e-11          .             .           .
    -----------------------------+------------------------------------------------
    Group1: Independent          |
                 var(covariate2) |   .0042044   .0217997      1.62e-07    108.9365
                      var(_cons) |   1.68e-16   1.27e-12             0           .
    -----------------------------+------------------------------------------------
                   var(Residual) |   .0187106   .0032308      .0133385    .0262462
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(4) = 49.66                 Prob > chi2 = 0.0000
    (1) interid_Group21 interid_Group22 interid_Group23 interid_Group24 interid_Group25 interid_Group26 interid_Group27 interid_Group28 interid_Group29 interid_Group210 interid_Group211 interid_Group212 interid_Group213
        interid_Group214 interid_Group215 interid_Group216 interid_Group217 interid_Group218 interid_Group219
     
    Note: LR test is conservative and provided only for reference.
     
    .
    . *--------------------------------------------------*
    . *Bayesian model, with burnin(200000)
    . bayes, burnin(200000) dots(1000)  rseed(42) saving(investigate_bayes_mixed_simulation, replace): ///
    >     mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2
    note: Gibbs sampling is used for regression coefficients and some variance components.
     
    Burn-in 200000 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa done
    Simulation 10000 .......... done
     
    file investigate_bayes_mixed_simulation.dta not found; file saved.
     
    Multilevel structure
    ------------------------------------------------------------------------------
    Group2
        {U0}: random intercepts
     
    _all
        {V0}: random intercepts
        {V1}: random coefficients for interid_Group21
        {V2}: random coefficients for interid_Group22
        {V3}: random coefficients for interid_Group23
        {V4}: random coefficients for interid_Group24
        {V5}: random coefficients for interid_Group25
        {V6}: random coefficients for interid_Group26
        {V7}: random coefficients for interid_Group27
        {V8}: random coefficients for interid_Group28
        {V9}: random coefficients for interid_Group29
        {V10}: random coefficients for interid_Group210
        {V11}: random coefficients for interid_Group211
        {V12}: random coefficients for interid_Group212
        {V13}: random coefficients for interid_Group213
        {V14}: random coefficients for interid_Group214
        {V15}: random coefficients for interid_Group215
        {V16}: random coefficients for interid_Group216
        {V17}: random coefficients for interid_Group217
        {V18}: random coefficients for interid_Group218
        {V19}: random coefficients for interid_Group219
     
    Group1
        {W0}: random intercepts
        {W1}: random coefficients for covariate2
    ------------------------------------------------------------------------------
     
    Model summary
    ------------------------------------------------------------------------------
    Likelihood:
      outcome ~ normal(xb_outcome,{e.outcome:sigma2})
     
    Priors:
      {outcome:covariate1 covariate2 _cons} ~ normal(0,10000)                  (1)
                                       {U0} ~ normal(0,{U:sigma2})             (1)
                                       {V0} ~ normal(0,{V:sigma2})             (1)
                                       {V1} ~ normal(0,{V:sigma2})             (1)
                                       {V2} ~ normal(0,{V:sigma2})             (1)
                                       {V3} ~ normal(0,{V:sigma2})             (1)
                                       {V4} ~ normal(0,{V:sigma2})             (1)
                                       {V5} ~ normal(0,{V:sigma2})             (1)
                                       {V6} ~ normal(0,{V:sigma2})             (1)
                                       {V7} ~ normal(0,{V:sigma2})             (1)
                                       {V8} ~ normal(0,{V:sigma2})             (1)
                                       {V9} ~ normal(0,{V:sigma2})             (1)
                                      {V10} ~ normal(0,{V:sigma2})             (1)
                                      {V11} ~ normal(0,{V:sigma2})             (1)
                                      {V12} ~ normal(0,{V:sigma2})             (1)
                                      {V13} ~ normal(0,{V:sigma2})             (1)
                                      {V14} ~ normal(0,{V:sigma2})             (1)
                                      {V15} ~ normal(0,{V:sigma2})             (1)
                                      {V16} ~ normal(0,{V:sigma2})             (1)
                                      {V17} ~ normal(0,{V:sigma2})             (1)
                                      {V18} ~ normal(0,{V:sigma2})             (1)
                                      {V19} ~ normal(0,{V19:sigma2})           (1)
                                       {W0} ~ normal(0,{W0:sigma2})            (1)
                                       {W1} ~ normal(0,{W1:sigma2})            (1)
                         {e.outcome:sigma2} ~ igamma(.01,.01)
     
    Hyperpriors:
        {U:sigma2} ~ igamma(.01,.01)
        {V:sigma2} ~ igamma(.01,.01)
      {V19:sigma2} ~ igamma(.01,.01)
       {W0:sigma2} ~ igamma(.01,.01)
       {W1:sigma2} ~ igamma(.01,.01)
    ------------------------------------------------------------------------------
    (1) Parameters are elements of the linear form xb_outcome.
     
    Bayesian multilevel regression                   MCMC iterations  =    210,000
    Metropolis–Hastings and Gibbs sampling           Burn-in          =    200,000
                                                     MCMC sample size =     10,000
     
    -------------------------------------------------------------
                    |     No. of       Observations per group
     Group variable |     groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
               _all |          1         97       97.0         97
             Group1 |          1         97       97.0         97
    -------------------------------------------------------------
     
                                                     Number of obs    =         97
                                                     Acceptance rate  =      .5045
                                                     Efficiency:  min =    .002255
                                                                  avg =     .06496
    Log marginal-likelihood                                       max =      .2841
     
    ------------------------------------------------------------------------------
                 |                                                Equal-tailed
                 |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
    -------------+----------------------------------------------------------------
    outcome      |
      covariate1 | -.0110874   .0059492   .000344  -.0109953  -.0228833   .0004529
      covariate2 | -.1876556   .1218701   .006668  -.1864473  -.4312361     .05925
           _cons |  .6759434   .1800871   .037924   .6768317   .3181374   1.030745
    -------------+----------------------------------------------------------------
    Group2       |
        U:sigma2 |  .0267817   .0142745   .001079   .0236498   .0091008   .0632229
    -------------+----------------------------------------------------------------
    _all         |
        V:sigma2 |  .0328675   .0291046   .003738   .0227667   .0049685   .1147533
      V19:sigma2 |  648.4501   21147.34   396.721   .0976229   .0059268   45.86536
    -------------+----------------------------------------------------------------
    Group1       |
       W0:sigma2 |  .0076228    .005156   .000306   .0063366   .0020867   .0204744
       W1:sigma2 |  .0304866   .0299592   .002302    .020928   .0038986   .1133497
    -------------+----------------------------------------------------------------
    e.outcome    |
          sigma2 |   .019182   .0033781   .000083   .0188363   .0135053   .0268277
    ------------------------------------------------------------------------------
    Note: Default priors are used for model parameters.
    Note: There is a high autocorrelation after 500 lags.
    My guess is that the problem must have to do with the size of the burn-in period: I have tried
    burnin(20000), burnin(100000) and burnin(150000).
    For all three sizes, everything works fine.
    Has anyone else encountered similar problems and successfully fixed them?


    Just in case: at the moment, I’m using Nikolay’s workaround to retrieve estimation results (described here Retrieving estimation results from "bayes [...] saving(xyz)" command):
    Code:
    .
    . *Make temporary directory to store simulation dataset
    . capture mkdir "temp_sim"
     
    .
    . copy "investigate_bayes_mixed_simulation.dta" "temp_sim/investigate_bayes_mixed_simulation.dta"
     
    .
    . *Re-run Bayesian model with burnin(1)
    . use test_data, clear
     
    .
    .
    . qui tab Group2, gen(id_Group2)
     
    . unab idvar: id_Group2*
     
    . foreach v of local idvar {
      2.     gen inter`v' =covariate2*`v'
      3. }
     
    .
    .
    . *--------------------------------------------------*
    . bayes, burnin(1) rseed(42) saving(investigate_bayes_mixed_simulation, replace): ///
    >     mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2
    note: Gibbs sampling is used for regression coefficients and some variance components.
     
    Burn-in 1  done
    Simulation 10000 aaaaaaaaa1000aaaaaaaaa2000aaaaa....3000.........4000.........5000.........6000.........7000.........8000.........9000.........10000 done
     
    file investigate_bayes_mixed_simulation.dta saved.
     
    Multilevel structure
    ------------------------------------------------------------------------------
    Group2
        {U0}: random intercepts
     
    _all
        {V0}: random intercepts
        {V1}: random coefficients for interid_Group21
        {V2}: random coefficients for interid_Group22
        {V3}: random coefficients for interid_Group23
        {V4}: random coefficients for interid_Group24
        {V5}: random coefficients for interid_Group25
        {V6}: random coefficients for interid_Group26
        {V7}: random coefficients for interid_Group27
        {V8}: random coefficients for interid_Group28
        {V9}: random coefficients for interid_Group29
        {V10}: random coefficients for interid_Group210
        {V11}: random coefficients for interid_Group211
        {V12}: random coefficients for interid_Group212
        {V13}: random coefficients for interid_Group213
        {V14}: random coefficients for interid_Group214
        {V15}: random coefficients for interid_Group215
        {V16}: random coefficients for interid_Group216
        {V17}: random coefficients for interid_Group217
        {V18}: random coefficients for interid_Group218
        {V19}: random coefficients for interid_Group219
     
    Group1
        {W0}: random intercepts
        {W1}: random coefficients for covariate2
    ------------------------------------------------------------------------------
     
    Model summary
    ------------------------------------------------------------------------------
    Likelihood:
      outcome ~ normal(xb_outcome,{e.outcome:sigma2})
     
    Priors:
      {outcome:covariate1 covariate2 _cons} ~ normal(0,10000)                  (1)
                                       {U0} ~ normal(0,{U:sigma2})             (1)
                                       {V0} ~ normal(0,{V:sigma2})             (1)
                                       {V1} ~ normal(0,{V:sigma2})             (1)
                                       {V2} ~ normal(0,{V:sigma2})             (1)
                                       {V3} ~ normal(0,{V:sigma2})             (1)
                                       {V4} ~ normal(0,{V:sigma2})             (1)
                                       {V5} ~ normal(0,{V:sigma2})             (1)
                                       {V6} ~ normal(0,{V:sigma2})             (1)
                                       {V7} ~ normal(0,{V:sigma2})             (1)
                                       {V8} ~ normal(0,{V:sigma2})             (1)
                                       {V9} ~ normal(0,{V:sigma2})             (1)
                                      {V10} ~ normal(0,{V:sigma2})             (1)
                                      {V11} ~ normal(0,{V:sigma2})             (1)
                                      {V12} ~ normal(0,{V:sigma2})             (1)
                                      {V13} ~ normal(0,{V:sigma2})             (1)
                                      {V14} ~ normal(0,{V:sigma2})             (1)
                                      {V15} ~ normal(0,{V:sigma2})             (1)
                                      {V16} ~ normal(0,{V:sigma2})             (1)
                                      {V17} ~ normal(0,{V:sigma2})             (1)
                                      {V18} ~ normal(0,{V:sigma2})             (1)
                                      {V19} ~ normal(0,{V19:sigma2})           (1)
                                       {W0} ~ normal(0,{W0:sigma2})            (1)
                                       {W1} ~ normal(0,{W1:sigma2})            (1)
                         {e.outcome:sigma2} ~ igamma(.01,.01)
     
    Hyperpriors:
        {U:sigma2} ~ igamma(.01,.01)
        {V:sigma2} ~ igamma(.01,.01)
      {V19:sigma2} ~ igamma(.01,.01)
       {W0:sigma2} ~ igamma(.01,.01)
       {W1:sigma2} ~ igamma(.01,.01)
    ------------------------------------------------------------------------------
    (1) Parameters are elements of the linear form xb_outcome.
     
    Bayesian multilevel regression                   MCMC iterations  =     10,001
    Metropolis–Hastings and Gibbs sampling           Burn-in          =          1
                                                     MCMC sample size =     10,000
     
    -------------------------------------------------------------
                    |     No. of       Observations per group
     Group variable |     groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
               _all |          1         97       97.0         97
             Group1 |          1         97       97.0         97
    -------------------------------------------------------------
     
                                                     Number of obs    =         97
                                                     Acceptance rate  =      .5247
                                                     Efficiency:  min =    .001728
                                                                  avg =      .1486
    Log marginal-likelihood                                       max =          1
     
    ------------------------------------------------------------------------------
                 |                                                Equal-tailed
                 |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
    -------------+----------------------------------------------------------------
    outcome      |
      covariate1 |  -.010756   .0057898   .000262  -.0107464   -.022353   .0008853
      covariate2 | -.1865147   .1113828   .005497  -.1863413  -.3999516   .0360584
           _cons |   .589891   .2360603    .05678   .6385582  -.1050936   .9248694
    -------------+----------------------------------------------------------------
    Group2       |
        U:sigma2 |  .0252251   .0130876   .000589   .0224263   .0083923   .0581508
    -------------+----------------------------------------------------------------
    _all         |
        V:sigma2 |  .0413304   .0492257    .00763   .0251147   .0051311   .1700203
      V19:sigma2 |  363.2888    14164.7   141.647   .1094709   .0061633    61.7817
    -------------+----------------------------------------------------------------
    Group1       |
       W0:sigma2 |  .0078951   .0050638   .000273   .0065598   .0022338   .0212309
       W1:sigma2 |  .0275205    .027656   .002792   .0185549   .0036315   .1026532
    -------------+----------------------------------------------------------------
    e.outcome    |
          sigma2 |  .0192395   .0035545   .000092   .0187985   .0135397   .0274202
    ------------------------------------------------------------------------------
    Note: Default priors are used for model parameters.
    Note: Adaptation continues during simulation.
    Note: There is a high autocorrelation after 500 lags.
    Note: Adaptation tolerance is not met in at least one of the blocks.
     
    .     estimates save investigate_bayes_mixed, replace
    file investigate_bayes_mixed.ster saved
     
    .
    . *--------------------------------------------------*
    .
    . *Copy simulation dataset back to current directory
    . copy "temp_sim/investigate_bayes_mixed_simulation.dta" "investigate_bayes_mixed_simulation.dta", replace
     
    .
    . *Re-run model based on initial run
    .
    . bayes
     
    Multilevel structure
    ------------------------------------------------------------------------------
    Group2
        {U0}: random intercepts
     
    _all
        {V0}: random intercepts
        {V1}: random coefficients for interid_Group21
        {V2}: random coefficients for interid_Group22
        {V3}: random coefficients for interid_Group23
        {V4}: random coefficients for interid_Group24
        {V5}: random coefficients for interid_Group25
        {V6}: random coefficients for interid_Group26
        {V7}: random coefficients for interid_Group27
        {V8}: random coefficients for interid_Group28
        {V9}: random coefficients for interid_Group29
        {V10}: random coefficients for interid_Group210
        {V11}: random coefficients for interid_Group211
        {V12}: random coefficients for interid_Group212
        {V13}: random coefficients for interid_Group213
        {V14}: random coefficients for interid_Group214
        {V15}: random coefficients for interid_Group215
        {V16}: random coefficients for interid_Group216
        {V17}: random coefficients for interid_Group217
        {V18}: random coefficients for interid_Group218
        {V19}: random coefficients for interid_Group219
     
    Group1
        {W0}: random intercepts
        {W1}: random coefficients for covariate2
    ------------------------------------------------------------------------------
     
    Model summary
    ------------------------------------------------------------------------------
    Likelihood:
      outcome ~ normal(xb_outcome,{e.outcome:sigma2})
     
    Priors:
      {outcome:covariate1 covariate2 _cons} ~ normal(0,10000)                  (1)
                                       {U0} ~ normal(0,{U:sigma2})             (1)
                                       {V0} ~ normal(0,{V:sigma2})             (1)
                                       {V1} ~ normal(0,{V:sigma2})             (1)
                                       {V2} ~ normal(0,{V:sigma2})             (1)
                                       {V3} ~ normal(0,{V:sigma2})             (1)
                                       {V4} ~ normal(0,{V:sigma2})             (1)
                                       {V5} ~ normal(0,{V:sigma2})             (1)
                                       {V6} ~ normal(0,{V:sigma2})             (1)
                                       {V7} ~ normal(0,{V:sigma2})             (1)
                                       {V8} ~ normal(0,{V:sigma2})             (1)
                                       {V9} ~ normal(0,{V:sigma2})             (1)
                                      {V10} ~ normal(0,{V:sigma2})             (1)
                                      {V11} ~ normal(0,{V:sigma2})             (1)
                                      {V12} ~ normal(0,{V:sigma2})             (1)
                                      {V13} ~ normal(0,{V:sigma2})             (1)
                                      {V14} ~ normal(0,{V:sigma2})             (1)
                                      {V15} ~ normal(0,{V:sigma2})             (1)
                                      {V16} ~ normal(0,{V:sigma2})             (1)
                                      {V17} ~ normal(0,{V:sigma2})             (1)
                                      {V18} ~ normal(0,{V:sigma2})             (1)
                                      {V19} ~ normal(0,{V19:sigma2})           (1)
                                       {W0} ~ normal(0,{W0:sigma2})            (1)
                                       {W1} ~ normal(0,{W1:sigma2})            (1)
                         {e.outcome:sigma2} ~ igamma(.01,.01)
     
    Hyperpriors:
        {U:sigma2} ~ igamma(.01,.01)
        {V:sigma2} ~ igamma(.01,.01)
      {V19:sigma2} ~ igamma(.01,.01)
       {W0:sigma2} ~ igamma(.01,.01)
       {W1:sigma2} ~ igamma(.01,.01)
    ------------------------------------------------------------------------------
    (1) Parameters are elements of the linear form xb_outcome.
     
    Bayesian multilevel regression                   MCMC iterations  =     10,001
    Metropolis–Hastings and Gibbs sampling           Burn-in          =          1
                                                     MCMC sample size =     10,000
     
    -------------------------------------------------------------
                    |     No. of       Observations per group
     Group variable |     groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
               _all |          1         97       97.0         97
             Group1 |          1         97       97.0         97
    -------------------------------------------------------------
     
                                                     Number of obs    =         97
                                                     Acceptance rate  =      .5247
                                                     Efficiency:  min =    .002255
                                                                  avg =     .06496
    Log marginal-likelihood                                       max =      .2841
     
    ------------------------------------------------------------------------------
                 |                                                Equal-tailed
                 |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
    -------------+----------------------------------------------------------------
    outcome      |
      covariate1 | -.0110874   .0059492   .000344  -.0109953  -.0228833   .0004529
      covariate2 | -.1876556   .1218701   .006668  -.1864473  -.4312361     .05925
           _cons |  .6759434   .1800871   .037924   .6768317   .3181374   1.030745
    -------------+----------------------------------------------------------------
    Group2       |
        U:sigma2 |  .0267817   .0142745   .001079   .0236498   .0091008   .0632229
    -------------+----------------------------------------------------------------
    _all         |
        V:sigma2 |  .0328675   .0291046   .003738   .0227667   .0049685   .1147533
      V19:sigma2 |  648.4501   21147.34   396.721   .0976229   .0059268   45.86536
    -------------+----------------------------------------------------------------
    Group1       |
       W0:sigma2 |  .0076228    .005156   .000306   .0063366   .0020867   .0204744
       W1:sigma2 |  .0304866   .0299592   .002302    .020928   .0038986   .1133497
    -------------+----------------------------------------------------------------
    e.outcome    |
          sigma2 |   .019182   .0033781   .000083   .0188363   .0135053   .0268277
    ------------------------------------------------------------------------------
    Note: Default priors are used for model parameters.
    Note: There is a high autocorrelation after 500 lags.
     
    .
    . log close
    Best,
    Alex
    Attached Files
    Last edited by Alexander Koplenig; 30 May 2023, 01:16.

  • #2
    PS: Further tests show that Stata doesn't crash with a burn-in period of 200,000 if any of the random coefficients are removed, e.g. running
    Code:
    mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1:
    or
    Code:
    mixed outcome covariate1 covariate2 || _all: R.Group2 || Group1: covariate
    instead of
    Code:
    mixed outcome covariate1 covariate2 || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2
    doesn't lead to a crash.

    Comment


    • #3
      Originally posted by Alexander Koplenig View Post
      PS: Further tests show that Stata doesn't crash with a burn-in period of 200,000 if any of the random coefficients are removed . . .
      I take that you really do intend to have the same covariate as the random slope for both grouping factors.

      It's not entirely clear why V19:sigma2 is broken out from all of the other V's variances, which are pooled under V:sigma2. I suppose that it has to do with the linear dependence of the parameters, coupled with the noconstant option, but its value seems anomalous, poorly estimated and grossly inconsistent with the maximum likelihood estimate (which has collapsed to zero). You might want to look into constraining V19's to be equal to that pooled for the others.

      Have you considered weakly informative priors, say, N(0, 1), for all parameters (variances included)? That might obviate the need for extraordinary burn-in. Stata's defaults are really just placeholders, I think.

      Comment


      • #4
        Thanks Joseph for your reply.

        I probably should have made it clearer that the test dataset is just a sample of a larger dataset I'm working with and that I wanted to keep the model specification to the bare minimum for illustration purposes, so I left out everything from choosing different priors (thanks for the valuable link!) to simulating more than one chain.

        The problem I wanted to point out is that for such models, on the machine I'm working on, everything works fine for shorter burn-in periods (here 150,000 and below), but for longer burn-in periods (here 200,000 and above) Stata crashes. By "crashes" I mean that Stata exits without any error message.
        Thanks again,
        Alex
        Last edited by Alexander Koplenig; 30 May 2023, 23:47.

        Comment


        • #5
          Yeah, it's probably a bug, but an extreme edge case: burn-ins of hundreds of thousands shouldn't be necessary in any model, and long before reaching that point I'd have stepped back and reassessed things.

          The bigger problem to me is how Stata's handling the hyperparameterization of the Vs. It doesn't make sense to me that Stata is factoring V19:sigma2 out from V:sigma2, and when it does it gives rise to nonsensical results. If I were to report anything to StataCorp, it would be that.

          Comment


          • #6
            Good point.
            The following example shows that the same happens when using bayes: for Isabel’s original example.
            Once again, the parameterizations of the Vs seems strange
            Code:
            .
            . *Get data via: net from http://www.stata-press.com/data/mlmus2/
            .
            . use homework, clear
             
            .
            . tab region, gen(id_reg)
             
                 region |      Freq.     Percent        Cum.
            ------------+-----------------------------------
                      1 |        100       19.27       19.27
                      2 |        254       48.94       68.21
                      3 |        165       31.79      100.00
            ------------+-----------------------------------
                  Total |        519      100.00
             
            .
            . unab idvar: id_reg*
             
            . foreach v of local idvar{
              2.     gen inter`v' = meanses*`v'
              3. }
             
            .
            . *--------------------------------------------*
            . *Run frequentist mixed model
            . mixed math meanses parented public || _all: R.region ||  _all:inter*, cov(identity) nocons || urban: public
             
            Performing EM optimization ...
             
            Performing gradient-based optimization:
            Iteration 0:  Log likelihood = -1871.5973 
            Iteration 1:  Log likelihood = -1871.5973 
             
            Computing standard errors ...
             
            Mixed-effects ML regression                             Number of obs =    519
             
                    Grouping information
                    -------------------------------------------------------------
                                    |     No. of       Observations per group
                     Group variable |     groups    Minimum    Average    Maximum
                    ----------------+--------------------------------------------
                               _all |          1        519      519.0        519
                              urban |          3        122      173.0        242
                    -------------------------------------------------------------
             
                                                                    Wald chi2(3)  =  66.24
            Log likelihood = -1871.5973                             Prob > chi2   = 0.0000
             
            ------------------------------------------------------------------------------
                    math | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                 meanses |   3.243719   1.749818     1.85   0.064    -.1858614    6.673299
                parented |   2.536147   .3635308     6.98   0.000      1.82364    3.248654
                  public |   1.456702   2.323287     0.63   0.531    -3.096857     6.01026
                   _cons |   42.12556   2.868475    14.69   0.000     36.50345    47.74766
            ------------------------------------------------------------------------------
             
            ------------------------------------------------------------------------------
              Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
            -----------------------------+------------------------------------------------
            _all: Identity               |
                           var(R.region) |   5.905095   6.292358      .7314804    47.67066
            -----------------------------+------------------------------------------------
            _all: Identity               |
              var(interi~1..interi~3)(1) |   3.910939   5.800271      .2137338    71.56305
            -----------------------------+------------------------------------------------
            urban: Independent           |
                             var(public) |   10.70444   11.65825      1.266243    90.49221
                              var(_cons) |    11.6372   12.13898      1.506416    89.89848
            -----------------------------+------------------------------------------------
                           var(Residual) |   75.94031   4.780382      67.12587     85.9122
            ------------------------------------------------------------------------------
            LR test vs. linear model: chi2(4) = 21.77                 Prob > chi2 = 0.0002
            (1) interid_reg1 interid_reg2 interid_reg3
             
            Note: LR test is conservative and provided only for reference.
             
            .
            . *--------------------------------------------*
            . *Run bayesian mixed model
            . bayes, burnin(10000) dots(1000) rseed(42) saving(test_simulation, replace): mixed math parented meanses public || _all: R.region ||  _all:inter*, cov(identity) nocons || urban: public
            note: Gibbs sampling is used for regression coefficients and some variance components.
             
            Burn-in 10000 aaaaaaaaaa done
            Simulation 10000 .......... done
             
            file test_simulation.dta not found; file saved.
             
            Multilevel structure
            ------------------------------------------------------------------------------
            region
                {U0}: random intercepts
             
            _all
                {V0}: random intercepts
                {V1}: random coefficients for interid_reg1
                {V2}: random coefficients for interid_reg2
                {V3}: random coefficients for interid_reg3
             
            urban
                {W0}: random intercepts
                {W1}: random coefficients for public
            ------------------------------------------------------------------------------
             
            Model summary
            ------------------------------------------------------------------------------
            Likelihood:
              math ~ normal(xb_math,{e.math:sigma2})
             
            Priors:
              {math:parented meanses public _cons} ~ normal(0,10000)                   (1)
                                              {U0} ~ normal(0,{U:sigma2})              (1)
                                        {V0 V1 V2} ~ normal(0,{V:sigma2})              (1)
                                              {V3} ~ normal(0,{V3:sigma2})             (1)
                                              {W0} ~ normal(0,{W0:sigma2})             (1)
                                              {W1} ~ normal(0,{W1:sigma2})             (1)
                                   {e.math:sigma2} ~ igamma(.01,.01)
             
            Hyperpriors:
               {U:sigma2} ~ igamma(.01,.01)
               {V:sigma2} ~ igamma(.01,.01)
              {V3:sigma2} ~ igamma(.01,.01)
              {W0:sigma2} ~ igamma(.01,.01)
              {W1:sigma2} ~ igamma(.01,.01)
            ------------------------------------------------------------------------------
            (1) Parameters are elements of the linear form xb_math.
             
            Bayesian multilevel regression                   MCMC iterations  =     20,000
            Metropolis–Hastings and Gibbs sampling           Burn-in          =     10,000
                                                             MCMC sample size =     10,000
             
            -------------------------------------------------------------
                            |     No. of       Observations per group
             Group variable |     groups    Minimum    Average    Maximum
            ----------------+--------------------------------------------
                       _all |          1        519      519.0        519
                      urban |          1        519      519.0        519
            -------------------------------------------------------------
             
                                                             Number of obs    =        519
                                                             Acceptance rate  =      .6513
                                                             Efficiency:  min =    .001324
                                                                          avg =      .2068
            Log marginal-likelihood                                       max =      .8525
             
            ------------------------------------------------------------------------------
                         |                                                Equal-tailed
                         |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
            -------------+----------------------------------------------------------------
            math         |
                parented |  2.541958   .3708861   .004017   2.544392   1.822191   3.270409
                 meanses |  1.348668   4.698861   1.00379   1.995458    -11.182   8.596084
                  public |  .0095209   12.89427   3.45928   1.249603  -30.65934   25.40547
                   _cons |  36.88891   14.84543    4.0792   37.47589   9.191595   62.45532
            -------------+----------------------------------------------------------------
            region       |
                U:sigma2 |  198.0477   868.0009    55.214    37.0407   1.510005   1280.511
            -------------+----------------------------------------------------------------
            _all         |
                V:sigma2 |  56.34161   81.79127   20.4278   14.34443   .3528027   282.8328
               V3:sigma2 |    616315   2.44e+07    383397   5.581477   .0124682    8683.06
            -------------+----------------------------------------------------------------
            urban        |
               W0:sigma2 |   223.604   1060.853   62.7861   48.30993   1.778735   1303.457
               W1:sigma2 |  471.6207   1965.668   162.069   90.02992   1.610921   3102.505
            -------------+----------------------------------------------------------------
            e.math       |
                  sigma2 |  78.48072   6.199874   .072292   77.83189    68.3175   92.34401
            ------------------------------------------------------------------------------
            Note: Default priors are used for model parameters.
            Note: There is a high autocorrelation after 500 lags.
             
            . estimates save test_bayes, replace
            file test_bayes.ster saved
             
            . log close

            Comment


            • #7
              Originally posted by Alexander Koplenig View Post
              The following example shows that the same happens when using bayes: for Isabel’s original example.
              Once again, the parameterizations of the Vs seems strange
              Agreed. It's spewing junk.

              I recommend contacting StataCorp's technical support about this.

              Comment


              • #8
                Will do!
                Out of curiosity, I tried out your N(0, 1) suggestion for the priors. Below are the results for a burn-in period of 10,000 and 100,000.
                10,000
                Code:
                . bayes, prior({math:} {V:sigma2} {U:sigma2} {V3:sigma2} {W0:sigma2} {W1:sigma2} , normal(0,1)) burnin(10000) dots(1000) rseed(42) saving(test_simulation, replace): mixed math parented meanses public || _all: R.region ||  _a
                > ll:inter*, cov(identity) nocons || urban: public
                note: Gibbs sampling is used for some variance components.
                 
                Burn-in 10000 aaaaaaaaaa done
                Simulation 10000 .......... done
                 
                file test_simulation.dta saved.
                 
                Multilevel structure
                ------------------------------------------------------------------------------
                region
                    {U0}: random intercepts
                 
                _all
                    {V0}: random intercepts
                    {V1}: random coefficients for interid_reg1
                    {V2}: random coefficients for interid_reg2
                    {V3}: random coefficients for interid_reg3
                 
                urban
                    {W0}: random intercepts
                    {W1}: random coefficients for public
                ------------------------------------------------------------------------------
                 
                Model summary
                ------------------------------------------------------------------------------
                Likelihood:
                  math ~ normal(xb_math,{e.math:sigma2})
                 
                Priors:
                  {math:parented meanses public _cons} ~ normal(0,1)                       (1)
                                                  {U0} ~ normal(0,{U:sigma2})              (1)
                                            {V0 V1 V2} ~ normal(0,{V:sigma2})              (1)
                                                  {V3} ~ normal(0,{V3:sigma2})             (1)
                                                  {W0} ~ normal(0,{W0:sigma2})             (1)
                                                  {W1} ~ normal(0,{W1:sigma2})             (1)
                                       {e.math:sigma2} ~ igamma(.01,.01)
                 
                Hyperpriors:
                   {V:sigma2} ~ normal(0,1)
                   {U:sigma2} ~ normal(0,1)
                  {V3:sigma2} ~ normal(0,1)
                  {W0:sigma2} ~ normal(0,1)
                  {W1:sigma2} ~ normal(0,1)
                ------------------------------------------------------------------------------
                (1) Parameters are elements of the linear form xb_math.
                 
                Bayesian multilevel regression                   MCMC iterations  =     20,000
                Metropolis–Hastings and Gibbs sampling           Burn-in          =     10,000
                                                                 MCMC sample size =     10,000
                 
                -------------------------------------------------------------
                                |     No. of       Observations per group
                 Group variable |     groups    Minimum    Average    Maximum
                ----------------+--------------------------------------------
                           _all |          1        519      519.0        519
                          urban |          1        519      519.0        519
                -------------------------------------------------------------
                 
                                                                 Number of obs    =        519
                                                                 Acceptance rate  =      .4091
                                                                 Efficiency:  min =    .003519
                                                                              avg =     .05733
                Log marginal-likelihood                                       max =      .3529
                 
                ------------------------------------------------------------------------------
                             |                                                Equal-tailed
                             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
                -------------+----------------------------------------------------------------
                math         |
                    parented |  4.227816   .3049179    .03058   4.215109    3.65802   4.863543
                     meanses |  .0491451   .8224896   .059018   .0223693  -1.491569   1.776059
                      public |  2.123705   .7945473   .058257   2.116638   .5409869   3.643668
                       _cons |  4.746302   .9495195   .069709   4.774811    2.76222   6.617982
                -------------+----------------------------------------------------------------
                region       |
                    U:sigma2 |  3.746065   .8538998   .076514   3.722601   2.143132   5.500148
                -------------+----------------------------------------------------------------
                _all         |
                    V:sigma2 |   5.28936   .5071606   .085495    5.20253   4.428834    6.24269
                   V3:sigma2 |  .8836466   .6373397   .022307   .7524564     .04531   2.435369
                -------------+----------------------------------------------------------------
                urban        |
                   W0:sigma2 |  3.765399   .8039043   .065444   3.748756   2.219402   5.369989
                   W1:sigma2 |  1.446508   .7196328   .035482   1.369089   .2560512   2.996493
                -------------+----------------------------------------------------------------
                e.math       |
                      sigma2 |  83.14786   6.124751   .103099   82.75787   72.33765   96.36645
                ------------------------------------------------------------------------------
                Note: Default priors are used for some model parameters.
                Note: There is a high autocorrelation after 500 lags.
                Note: Adaptation tolerance is not met in at least one of the blocks.
                100,000
                Code:
                . bayes, prior({math:} {V:sigma2} {U:sigma2} {V3:sigma2} {W0:sigma2} {W1:sigma2} , normal(0,1)) burnin(100000) dots(1000) rseed(42) saving(test_simulation, replace): mixed math parented meanses public || _all: R.region ||  _
                > all:inter*, cov(identity) nocons || urban: public
                note: Gibbs sampling is used for some variance components.
                 
                Burn-in 100000 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa done
                Simulation 10000 .......... done
                 
                file test_simulation.dta not found; file saved.
                 
                Multilevel structure
                ------------------------------------------------------------------------------
                region
                    {U0}: random intercepts
                 
                _all
                    {V0}: random intercepts
                    {V1}: random coefficients for interid_reg1
                    {V2}: random coefficients for interid_reg2
                    {V3}: random coefficients for interid_reg3
                 
                urban
                    {W0}: random intercepts
                    {W1}: random coefficients for public
                ------------------------------------------------------------------------------
                 
                Model summary
                ------------------------------------------------------------------------------
                Likelihood:
                  math ~ normal(xb_math,{e.math:sigma2})
                 
                Priors:
                  {math:parented meanses public _cons} ~ normal(0,1)                       (1)
                                                  {U0} ~ normal(0,{U:sigma2})              (1)
                                            {V0 V1 V2} ~ normal(0,{V:sigma2})              (1)
                                                  {V3} ~ normal(0,{V3:sigma2})             (1)
                                                  {W0} ~ normal(0,{W0:sigma2})             (1)
                                                  {W1} ~ normal(0,{W1:sigma2})             (1)
                                       {e.math:sigma2} ~ igamma(.01,.01)
                 
                Hyperpriors:
                   {V:sigma2} ~ normal(0,1)
                   {U:sigma2} ~ normal(0,1)
                  {V3:sigma2} ~ normal(0,1)
                  {W0:sigma2} ~ normal(0,1)
                  {W1:sigma2} ~ normal(0,1)
                ------------------------------------------------------------------------------
                (1) Parameters are elements of the linear form xb_math.
                 
                Bayesian multilevel regression                   MCMC iterations  =    110,000
                Metropolis–Hastings and Gibbs sampling           Burn-in          =    100,000
                                                                 MCMC sample size =     10,000
                 
                -------------------------------------------------------------
                                |     No. of       Observations per group
                 Group variable |     groups    Minimum    Average    Maximum
                ----------------+--------------------------------------------
                           _all |          1        519      519.0        519
                          urban |          1        519      519.0        519
                -------------------------------------------------------------
                 
                                                                 Number of obs    =        519
                                                                 Acceptance rate  =      .4389
                                                                 Efficiency:  min =    .002596
                                                                              avg =      .0235
                Log marginal-likelihood                                       max =     .08252
                 
                ------------------------------------------------------------------------------
                             |                                                Equal-tailed
                             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
                -------------+----------------------------------------------------------------
                math         |
                    parented |  4.111427    .332615   .065279   4.100525    3.52727   4.874083
                     meanses |  .0550377   .8415122   .085535   .0217303  -1.568142   1.792977
                      public |  2.029672   .8472618   .082827   2.071854   .1924822   3.618146
                       _cons |  4.724996   .9894112   .149306   4.772051   2.661189   6.558499
                -------------+----------------------------------------------------------------
                region       |
                    U:sigma2 |  3.922214   .8262815   .072049    3.92789   2.309534   5.510978
                -------------+----------------------------------------------------------------
                _all         |
                    V:sigma2 |  5.178442   .6120028   .093315   5.191006   3.970948   6.365479
                   V3:sigma2 |  .8555653   .6095647    .02122   .7479953   .0509334   2.308674
                -------------+----------------------------------------------------------------
                urban        |
                   W0:sigma2 |  3.851325   .8386632   .084985   3.830769   2.264572   5.528421
                   W1:sigma2 |  1.481297   .7378747   .039616   1.438336   .2123395   3.057861
                -------------+----------------------------------------------------------------
                e.math       |
                      sigma2 |  82.06462    5.77319   .229178   81.67425   71.84723   94.25149
                ------------------------------------------------------------------------------
                Note: Default priors are used for some model parameters.
                Note: There is a high autocorrelation after 500 lags.
                So you're right, with weakly informative priors, longer burn-ins don't help that much.
                Thanks!
                Alex

                Comment


                • #9
                  Originally posted by Alexander Koplenig View Post
                  Will do!
                  Thank you. I was regrettably harsh in my choice of terms above, but it will help everyone to bring these findings to the attention of the company so that future users interested in Bayesian analysis won't have to blaze the trail.

                  Comment


                  • #10
                    II founded that while running a BVAR, the -bayes:var- command, it triggered a -conformability error-. It cannot handle the so-called Large VARs.
                    Furthermore, on -bayes , prior assumption is a key. Is there a way to have Stata adopt priors for BVARs in line to Domenico Giannone, Michele Lenza, and Giorgio E. Primiceri , REVSTAT 2015?
                    https://hedibert.org/wp-content/uplo...15-REVSTAT.pdf

                    Comment


                    • #11
                      Originally posted by Mario Ferri View Post
                      II founded that while running a BVAR, the -bayes:var- command, it triggered a -conformability error-.
                      You'll get better odds of a reply if you start a dedicated thread with a descriptive title. It'll help, too, if you follow the other recommendations in the forum's FAQ regarding example dataset, what you typed & got back etc.

                      Comment


                      • #12
                        Originally posted by Alexander Koplenig View Post
                        I would like to point out a problem that occurs when running various rather complex Bayesian multilevel models using the bayes prefix. . . .I try to run a crossed effects model with two ('fixed') covariates, two crossed effects and random coefficients ('slopes') in both levels . .
                        If I’ve read the user’s manual correctly, then it’s easy in bayesmh to specify a cross-classified random effects model with random slopes for each grouping variable—actually easier in bayesmh natively than with bayes: mixed. See below for an example fitting your sought-after model using your attached dataset. (I’ve shortened the variable names for readability.)

                        That’s the good news. The bad news is that you’ll discover an ugly secret about the corrugated surface of the likelihoods of such models, especially with relatively few members of the grouping factors with which to define their variances. In explorations with various simulated datasets, I’ve seen traces of multiple chains running in parallel across the graph, each chain happily ensconced in its own local maximum, and I’ve seen cases where the second chain flatlines, scarcely budging from initial values.

                        So, you can try something along the lines below, but unless you have a lot of well-behaved data, expect the traces to jump from one local maximum to another giving bimodal posterior density histograms like those seen when you run the code below. And expect that the model fit to be relatively sensitive to choice of priors and initial values. I know that what you attached to your original post is only a subset of your dataset, but given that three of your four variance components there are not estimable and have collapsed to zero in the maximum likelihood fit with mixed, you might find that a more parsimonious model will be required in order to get a solid fit even with the full dataset.
                        Code:
                        version 18.0
                        
                        clear *
                        
                        use test_data
                        rename (covariate# Group# outcome) (co# gr# out)
                        
                        quietly tabulate gr2, generate(c)
                        local r `r(r)'
                        foreach var of varlist c1-c`r' {
                            generate double g`var' = `var' * co2
                        }
                        /* mixed outcome covariate1 covariate2
                            || _all: R.Group2 || _all: inter*, cov(identity) nocons || Group1: covariate2 */
                        mixed out c.(co?) ///
                            || _all: R.gr2 || _all:gc1-gc`r', covariance(identity) noconstant ///
                                || gr1: co2, nogroup nolrtest nolog
                        
                        *
                        * Begin here
                        *
                        bayesmh out c.co1 c.co2 U0[gr1] U1[gr2] c.co2#U2[gr1] c.co2#U3[gr2], ///
                            likelihood(normal({Var})) ///
                            prior({out:}, normal(1)) ///
                                prior({var_U0 var_U1 var_U2 var_U3 Var}, igamma(0.1, 0.1)) ///
                            rseed(466245242) thinning(5)
                        
                        set more on
                        foreach graph in trace ac histogram {
                            foreach parm in co1 co2 _cons var_U0 var_U1 var_U2 var_U3 Var {
                                bayesgraph `graph' {`parm'}
                                more
                            }
                        }
                        set more off
                        
                        exit

                        Comment

                        Working...
                        X