Multiple Imputed Datasets for imputing MNAR data

Diana Rossi

Join Date: Apr 2018

Posts: 8
#1

21 Sep 2018, 17:24

Dear all,

I'm trying to translate into Stata syntax the following procedure for imputing MNAR data:

"Use Multiple Imputation (MI) to replace missing values. MI can be conducted in SPSS as a dedicated function and has 3 steps. First missing data are replaced using an EM algorithm augmented by a Bayesian procedure (conditional posterior distribution – you don’t need to know what this means, even I struggle with it!), which yields multiple imputed data sets. The second step involves analyzing each of the yielded data sets separately with standard statistics (e.g., linear regression). The third step involves aggregating results from each separate data set and calculating standard errors for significance testing on the basis of both within- and between-data set variance. Researchers argue that a small number of MI data sets (m = 10) will be adequate for most situations. The main advantage of MI is that by yielding multiple data sets, researchers can calculate the ‘true’ uncertainty (accounting for both within- and between-imputation variance) associated with analyses using missing data, and therefore it overcomes the problem of underestimated SEs using single data sets produced by EM. Another advantage of MI is that it performs well under MCAR, MAR, and MNAR, and is robust to large amounts of missing data (i.e., > 10%). An obvious drawback, however, is that MI provides for a cumbersome analysis with more than one data set to consider. It also provides different estimates with every execution meaning the results are not determinate. Nonetheless, this technique is well suited to analyses when there is a substantial proportion of missing data due to some systematic reason(s)."

Any suggestions about which commands to apply?

Thank you very much,
Diana
Clyde Schechter

Join Date: Apr 2014

Posts: 30147
#2

21 Sep 2018, 18:01

The PDF documentation that comes installed with your Stata has an entire volume, [MI], devoted to this. There's a lot to read, but the documentation is well written and contains numerous worked examples. To access the PDF documentation, type -help MI- and then click on the link "(View complete PDF manual entry)" near the top of the page. Settle in for a while. Get some snacks. Actually, get a few meals.
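
As a rough orientation before diving into [MI], the three steps in the quoted passage (impute, analyze each data set, pool) map onto the -mi- suite roughly as below. This is only a minimal sketch, not a recommended analysis: it uses the auto data set shipped with Stata, where rep78 has missing values, and the choice of variables, the linear imputation model for rep78, and the seed value are illustrative assumptions. Note that -mi impute- imputes under a MAR assumption; it does not by itself deal with MNAR data.

```stata
* Minimal sketch: variable choices and the seed are illustrative assumptions
sysuse auto, clear

* Declare the data as mi data and flag the variable to be imputed
mi set mlong
mi register imputed rep78

* Step 1: create m = 10 imputed data sets.
* rseed() makes the imputations reproducible across runs.
mi impute regress rep78 price mpg weight, add(10) rseed(12345)

* Steps 2 and 3: fit the model in each imputed data set and pool the
* results, combining within- and between-imputation variance
mi estimate: regress price rep78 mpg weight
```

Here add(10) corresponds to the m = 10 mentioned in the quotation, and fixing rseed() addresses the concern that results differ with every execution.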
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#3

22 Sep 2018, 02:06

Diana:
as an aside to Clyde's (as always) helpful advice, you may want to take a look at:
- van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med. 1999 Mar 30;18(6):681-94;
- van Buuren S. Flexible Imputation of Missing Data. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC, 2018;
- Leurent B, Gomes M, Faria R, Morris S, Grieve R, Carpenter JR. Sensitivity Analysis for Not-at-Random Missing Data in Trial-Based Cost-Effectiveness Analysis: A Tutorial. Pharmacoeconomics. 2018 Aug;36(8):889-901. doi: 10.1007/s40273-018-0650-5 (the article comes with supplementary materials containing the Stata code used by the authors).

Kind regards,
Carlo
(Stata 19.0)