Hi Statalist,
I would like to ask whether Stata's Bayesian analysis command bayesmh can perform data augmentation, either to handle missing data or to make estimation easier.
For the sake of demonstration, let's take the probit model as an example. Using Stata's built-in command, we can estimate the probit model by:
Code:
use "http://www.stata-press.com/data/r14/auto", clear
set seed 1234
bayesmh foreign mpg, likelihood(probit)        ///
    prior({foreign:mpg  }, normal(0,10000))    ///
    prior({foreign:_cons}, normal(100,0.5)) dots

Bayesian probit regression                       MCMC iterations  =     12,500
Random-walk Metropolis-Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =      .2111
                                                 Efficiency:  min =     .08724
                                                              avg =     .08743
Log marginal likelihood = -7390.0051                          max =     .08762

------------------------------------------------------------------------------
             |                                                Equal-tailed
     foreign |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
         mpg | -1.012607   .0132892    .00045   -1.01248   -1.03923  -.9864686
       _cons |  20.43789   .1596249   .005393   20.43533   20.12213    20.7582
------------------------------------------------------------------------------
However, I would like to learn how to use a bayesmh evaluator to program the data-augmentation algorithm.
If I understand correctly, data augmentation for the probit model proceeds in two steps:
1. Simulate the latent dependent variable y* from truncated normal distributions, given the observed binary-choice dependent variable y, the independent variables X, and the model parameters B.
2. Simulate the model parameters B from the posterior density of a linear regression that uses y*, instead of y, as the dependent variable.
For more details about data augmentation, please refer to Simon Jackman's Bayesian Analysis for the Social Sciences, pp. 381-2 (https://books.google.co.uk/books?id=...page&q&f=false).
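To make the two steps concrete, here is a minimal sketch of the data-augmentation Gibbs sampler for the probit model, written in Python/NumPy rather than Stata purely for illustration. The function name, priors, and defaults are my own choices, not part of bayesmh:

```python
# Sketch of two-step data augmentation for the probit model:
#   step 1: draw y* | y, X, B from truncated normals
#   step 2: draw B | y*, X from the linear-regression posterior
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(y, X, n_iter=2000, b0=None, B0_inv=None, seed=0):
    """Gibbs sampler for probit regression via data augmentation.

    b0, B0_inv are the normal prior mean and prior precision for B;
    the default B0_inv = 0 corresponds to a flat prior.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    if b0 is None:
        b0 = np.zeros(k)
    if B0_inv is None:
        B0_inv = np.zeros((k, k))
    beta = np.zeros(k)
    XtX = X.T @ X
    draws = np.empty((n_iter, k))
    for it in range(n_iter):
        # Step 1: y* ~ N(X B, 1) truncated to (0, inf) if y=1, (-inf, 0) if y=0.
        # truncnorm takes bounds in standardized units: (bound - mu) / sigma.
        mu = X @ beta
        lo = np.where(y == 1, -mu, -np.inf)
        hi = np.where(y == 1, np.inf, -mu)
        ystar = mu + truncnorm.rvs(lo, hi, random_state=rng)
        # Step 2: B | y*, X ~ N(b1, B1), the usual conjugate linear-regression
        # posterior with known error variance 1.
        B1 = np.linalg.inv(B0_inv + XtX)
        b1 = B1 @ (B0_inv @ b0 + X.T @ ystar)
        beta = rng.multivariate_normal(b1, B1)
        draws[it] = beta
    return draws
```

Note that the truncation point of the latent draw is 0 (standardized to -mu), not the linear prediction itself; conditioning on the observed y is what forces y* to the correct side of zero.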
Here is my code for doing data augmentation, but it apparently produces the wrong results, and I don't know how to fix it. Could anyone help me sort out this problem? Thank you very much.
Code:
cap prog drop ProbitLnP
prog ProbitLnP
    version 14
    args lnp xb
    /* simulate ystar from the truncated normal(0,+inf) if y=1 */
    tempvar ystar
    qui gen double `ystar' = invnormal(0.5+runiform()*(1-0.5))*1 + `xb' ///
        if ${MH_touse} & ${MH_y}==1
    /* simulate ystar from the truncated normal(-inf,0) if y=0 */
    qui replace `ystar' = invnormal(0 +runiform()*(0.5-0))*1 + `xb' ///
        if ${MH_touse} & ${MH_y}==0
    /* compute log likelihood of the simple linear regression
       using y* as the dep. var. */
    tempvar lnfj
    qui gen double `lnfj' = lnnormalden(`ystar', `xb', 1) if ${MH_touse}
    qui summ `lnfj', meanonly
    if r(N) < $MH_n {
        scalar `lnp' = .
        exit
    }
    tempname lnf
    scalar `lnf' = r(sum)
    /* compute log prior of the parameters */
    tempname lnprior
    scalar `lnprior' = lnnormalden($MH_b[1,1], 0, sqrt(10000)) + ///
        lnnormalden($MH_b[1,2], 100, sqrt(0.5))
    /* compute log posterior */
    scalar `lnp' = `lnf' + `lnprior'
end

set seed 1234
bayesmh foreign mpg, evaluator(ProbitLnP) dots

Bayesian regression                              MCMC iterations  =     12,500
Random-walk Metropolis-Hastings sampling         Burn-in          =      2,500
                                                 MCMC sample size =     10,000
                                                 Number of obs    =         74
                                                 Acceptance rate  =    .007199
Log marginal likelihood =         .                  Efficiency   =          1

------------------------------------------------------------------------------
             |                                                Equal-tailed
     foreign |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
         mpg |  2.234445          0        0   2.234445   2.234445   2.234445
       _cons |  101.6031          0        0   101.6031   101.6031   101.6031
------------------------------------------------------------------------------
---
Cheers,
Kirin