    New on SSC: bayeshmc — Bayesian Regression via Hamiltonian Monte Carlo (NUTS) Using CmdStan

    Dear Statalisters,

    I am pleased to announce that bayeshmc (version 4.2.0) is now available from SSC. To install, type in Stata:


    ssc install bayeshmc

    What bayeshmc does

    bayeshmc brings the No-U-Turn Sampler (NUTS), Stan's adaptive Hamiltonian Monte Carlo algorithm, to Stata by interfacing with CmdStan. It provides a bayes: prefix-style syntax for fitting Bayesian regression models using gradient-based sampling, which typically yields substantially higher effective sample sizes per draw than the random-walk Metropolis-Hastings sampler used by Stata's built-in bayes: prefix.

    Why use bayeshmc?

    Stata's native bayes: prefix uses adaptive Metropolis-Hastings (MH), which explores the posterior via random-walk proposals. This works well for simple models but can struggle with correlated parameters, hierarchical structures, and high-dimensional posteriors. HMC/NUTS uses gradient information to make large, directed moves through parameter space, dramatically reducing autocorrelation in the chains. In practice, this means:
    • Higher effective sample sizes (ESS) from the same number of posterior draws
    • Faster convergence for complex models, particularly multilevel and hierarchical specifications
    • Reliable sampling in high dimensions where random-walk MH exhibits slow mixing
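
    One way to check these claims on your own data is to fit the same model with both samplers and compare the ESS diagnostics. The following is a minimal sketch, not a benchmark: it assumes CmdStan is already configured, and it uses only the bayeshmc options and subcommands described in this post together with Stata's official bayes: prefix and bayesstats ess. The two commands use different default priors and different numbers of draws, so compare ESS relative to the number of draws rather than the posterior summaries themselves.

    ```stata
    * Same model, two samplers: compare effective sample sizes
    sysuse auto, clear

    * Stata's built-in adaptive Metropolis-Hastings sampler
    bayes, rseed(12345) : regress price mpg weight foreign
    bayesstats ess

    * NUTS via bayeshmc (requires a working CmdStan installation)
    bayeshmc, iter(5000) warmup(2000) seed(12345) : regress price mpg weight foreign
    bayeshmc ess
    ```
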
    Supported model families

    bayeshmc supports a wide range of regression models:
    • Continuous outcomes: regress, tobit, truncreg, intreg, betareg, glm
    • Binary outcomes: logit, probit, cloglog
    • Count outcomes: poisson, nbreg, gnbreg, tpoisson, zip, zinb
    • Ordinal outcomes: ologit, oprobit
    • Multinomial outcomes: mlogit
    • Survival models: streg (Weibull AFT)
    • Heteroscedastic models: hetregress, hetprobit, hetoprobit
    • Selection models: heckman, heckprobit
    • Multilevel/panel models: xtreg, mixed, melogit, meprobit, mepoisson, menbreg, meologit, meoprobit, mecloglog, mestreg, meglm — with random-intercept and unstructured covariance specifications
    • Covariance priors for multilevel models: LKJ (default), inverse-Wishart, scaled inverse-Wishart, Huang-Wand, and spherical decomposition
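
    As an illustration of the multilevel support, the sketch below fits a random-intercept, random-slope growth model with an unstructured covariance and a non-default covariance prior. It assumes that covprior() accepts the keyword huangwand exactly as spelled in the option list in this post; the dataset (webuse pig) is a standard Stata example dataset chosen here for illustration, not one taken from the package documentation.

    ```stata
    * Random intercept and random slope on week, unstructured covariance
    webuse pig, clear
    bayeshmc, covprior(huangwand) seed(2024) : mixed weight week || id: week, covariance(unstructured)

    * Posterior summary table for the fitted model
    bayeshmc summary
    ```
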
    Syntax

    bayeshmc uses a familiar bayes:-like prefix syntax:


    bayeshmc [, options] : estimation_command

    Examples

    stata
    * Linear regression
    sysuse auto, clear
    bayeshmc, iter(5000) warmup(2000) seed(12345) : regress price mpg weight foreign

    * Multilevel logistic regression
    webuse bangladesh, clear
    bayeshmc, iter(5000) warmup(2000) seed(54321) : melogit c_use urban age || district:

    * Multilevel model with unstructured covariance and LKJ prior
    bayeshmc, iter(5000) warmup(2000) lkjprior(2) seed(99999) : mixed score treatment time || school: time, covariance(unstructured)

    Post-estimation commands

    After estimation, bayeshmc provides a suite of post-estimation tools:


    stata
    bayeshmc summary     // Posterior summary table
    bayeshmc ess         // Effective sample size and R-hat diagnostics
    bayeshmc trace       // Trace plots for visual convergence assessment
    bayeshmc density     // Kernel density plots of posterior distributions
    bayeshmc ac          // Autocorrelation plots
    bayeshmc histogram   // Posterior histograms
    bayeshmc waic        // Watanabe-Akaike Information Criterion
    bayeshmc loo         // Pareto-smoothed LOO cross-validation (PSIS-LOO)

    Key options

    iter(#)            Post-warmup draws per chain (default: 2000)
    warmup(#)          Warmup/adaptation draws per chain (default: 1000)
    chains(#)          Number of MCMC chains (default: 4)
    seed(#)            Random number seed for reproducibility
    parallel           Run chains in parallel (default on multi-chain runs)
    threads(#)         CPU threads for within-chain parallelism
    priorsd(#)         Standard deviation for normal priors on coefficients (default: 10)
    lkjprior(#)        LKJ concentration parameter for correlation matrices (default: 1)
    covprior(string)   Covariance prior: lkj, iw, siw, huangwand, spherical
    level(#)           Credible interval level (default: 95)
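
    To show how several of these options combine in one call, here is a small sketch on the auto dataset. The option names and defaults come from the list above; the particular values (a tighter prior standard deviation of 5 and 90% credible intervals) are arbitrary choices for illustration, and it is an assumption that the options can be freely combined in this way.

    ```stata
    * Logistic regression with a tighter coefficient prior and 90% intervals
    sysuse auto, clear
    bayeshmc, chains(4) iter(2000) warmup(1000) priorsd(5) level(90) seed(2024) : logit foreign mpg weight

    * Inspect the posterior summary and check convergence visually
    bayeshmc summary
    bayeshmc trace
    ```
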
    Requirements

    After installing CmdStan, point bayeshmc to it:


    stata
    bayeshmc, setup path(/path/to/cmdstan)

    How it works

    bayeshmc automatically generates Stan model code, exports the data in JSON format, compiles and runs the Stan model through CmdStan, parses the posterior draws, and returns the results to Stata's e() framework. All of this happens behind the scenes, so the workflow feels native to Stata.

    Estimation results

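    Because results are returned through e(), they can be inspected with ordinary Stata matrix commands once a model has been fitted. The sketch below assumes a bayeshmc model has just been estimated in the current session and that the stored matrices carry the names given in this section; only official Stata commands are used.

    ```stata
    * Assumes a bayeshmc model has just been fitted in this session
    ereturn list

    * Standard estimation results
    matrix list e(b)
    matrix list e(V)

    * Additional diagnostics posted by bayeshmc
    matrix list e(ess)
    matrix list e(rhat)

    * Copy R-hat into a regular matrix for further checks
    matrix R = e(rhat)
    matrix list R
    ```
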
    bayeshmc posts standard Stata e() results including e(b) and e(V), plus additional matrices e(stats) (posterior summary), e(ess) (effective sample sizes), and e(rhat) (Gelman-Rubin convergence diagnostics) for all parameters.

    Further information

    A comprehensive companion textbook, Bayesian Regression Using Hamiltonian Monte Carlo in Stata: A Comprehensive Guide with bayeshmc, is forthcoming from the author. The book covers the theory of HMC/NUTS, all supported model families with worked examples, convergence diagnostics, model comparison via WAIC and LOO-CV, and advanced topics including multilevel models with various covariance priors.

    Type help bayeshmc after installation for complete documentation.

    I welcome feedback, bug reports, and feature requests.

    Best wishes,

    Ben Adarkwa Dwamena
    Clinical Associate Professor Emeritus of Radiology
    Division of Nuclear Medicine and Molecular Imaging
    University of Michigan, Ann Arbor