Multiple imputation by chained equations and summing imputed variables across the individual (CEA analysis)

Sebby Zajac

Join Date: May 2016

Posts: 8
#1


05 May 2016, 14:10

Hey there, I'm sure this has a simple answer, but as someone new to Stata I'm finding getting the answer a little difficult.

I'm doing a CEA for a small pilot study which uses survey data to collect costs and resources. Unfortunately there's a little bit of missingness among survey responders (about 3%) and about an 8% survey non-response rate. My PI has asked me to do a complete case analysis (which is available for 94% of survey responders, and obviously 0% of non-responders) and, the crux of my question, an MI analysis. I knew next to nothing about MI, but after reviewing a few textbooks and the literature I have planned a MICE strategy based on the pattern of missingness and the number of variables with missing data.

My problem is that we are imputing at the disaggregate level (individual cost and resource variables; these are important in our analysis, so imputing at the aggregate level doesn't work) and then summing these variables together in a micro-costing approach. I'm not sure how Stata deals with this fairly simple summation within each individual to generate a total cost (about 9 variables are summed to generate the final costs). Beyond that, because some of the imputed variables will be resources consumed, they will need to be multiplied by a "unit cost" (e.g. number of hospital nights * the cost of a night in hospital, specific to the individual's hospital) before being summed into total costs.

It's my understanding that basically this will require a fair bit of registering variables as passive to generate "intermediate" costs from resources*unit-cost and also summing across individuals for "total costs". But is there anything I'm missing? Is it as simple as generating an imputation model (PMM for costs, ordinal reg for resources etc.) and then using mi xeq to register a passive variable named "total costs" as the sum of the individual costs? (obviously with the background legwork for MCAR/MAR/MNAR, patterns of missingness etc. already done)

And once these are summed to a total cost, how does Stata use the new total costs in descriptive or regression analyses (as the independent or dependent variable)? Does everything get combined using Rubin's rules for the analysis?
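A minimal sketch of the workflow described above, on simulated data. Everything here is an assumption made for illustration: the variable names (nights, unit_cost, drug_cost, hosp_cost, total_cost), the two-item cost structure, the missingness rates, and the choice of 10 imputations. It is not the study's actual model, only the general pattern of imputing the components, building the costs as passive variables, and letting mi estimate pool the results.

```stata
* Simulated stand-in for the survey data (all names and values hypothetical)
clear all
set seed 12345
set obs 200
generate id = _n
generate age = rnormal(60, 10)
generate unit_cost = cond(mod(id, 2), 400, 550)   // cost per hospital night
generate nights = floor(4 * runiform())           // resource use: 0 to 3 nights
generate drug_cost = exp(rnormal(5, 0.5))         // a cost item
replace nights = . if runiform() < 0.10           // introduce missing values
replace drug_cost = . if runiform() < 0.10

* Declare the mi structure and impute at the disaggregate level
mi set flong
mi register imputed nights drug_cost
mi register regular age unit_cost
mi impute chained (ologit) nights (pmm, knn(5)) drug_cost = age unit_cost, ///
    add(10) rseed(2016)

* Passive variables: resource * unit cost, then the total per individual.
* Plain addition leaves total_cost missing in the original data (m=0)
* wherever a component is missing.
mi passive: generate hosp_cost = nights * unit_cost
mi passive: generate total_cost = hosp_cost + drug_cost

* Analyses are run in each imputed dataset and pooled with Rubin's rules
mi estimate: mean total_cost
mi estimate: regress total_cost age
```

In this sketch total_cost is used as the dependent variable; it could equally appear on the right-hand side of a model passed to mi estimate.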

Thanks!
Sebby
Jin Russell

Join Date: Jan 2016

Posts: 15
#2

16 Jun 2016, 18:15

Hi Sebby

I don't have a solution to your question here except to share that like you, I am new to MICE, and also have the very same issue with my dataset. I am using MICE to fill in the missing values in a large dataset with about 7000 cases and many variables. Once I've done that, I need to create 'health indices' for each individual case - that is, I will need to create a 'total health index score' variable that is the sum of other (imputed) variables.

I can't see how to do this using the mi estimate command, because mi estimate creates pooled estimates based on Rubin's rules, and this sort of 'sum' command isn't supported by mi estimate.
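For reference, Rubin's rules pool estimates rather than individual values. The notation below is introduced here for illustration: M is the number of imputed datasets, \hat{Q}_m is the estimate (for example a mean or a regression coefficient) from dataset m, and U_m is its estimated variance.

```latex
\bar{Q} = \frac{1}{M}\sum_{m=1}^{M}\hat{Q}_m, \qquad
\bar{U} = \frac{1}{M}\sum_{m=1}^{M} U_m, \qquad
B = \frac{1}{M-1}\sum_{m=1}^{M}\left(\hat{Q}_m - \bar{Q}\right)^2, \qquad
T = \bar{U} + \left(1 + \frac{1}{M}\right) B
```

Here \bar{Q} is the pooled estimate and T its total variance. A row sum is not an estimate in this sense, so it would be computed within each imputed dataset first, and only the statistics calculated from it would then be pooled.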

The best I can think to do here is to use the mi xeq command to create the 'total health index score' variable (passive) in each imputed dataset, and then to average these somehow.

Did you figure out what to do for this? I'd be really interested.

Warmly

Jin
Oded Mcdossi

Join Date: Jun 2014

Posts: 577
#3

17 Jun 2016, 00:21

I think what you need to do in this case is first to mi impute all of the variables with missing values that make up the total index, and then to use mi passive: egen total_health_index_score=rowtotal(item1-item20).
You don't need to average the total index across all of the imputed datasets, since that is not meaningful. If you would like to report the mean of the total index, all you need is one of the following commands, which are relevant to the flong format:

Code:

//for the mean only:
sum total_health_index_score if _mi_m>0
//or
mi estimate: mean total_health_index_score
//for means and standard deviations you should install misum (SSC) first and read the help file:
ssc install misum
help misum
misum total_health_index_score
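One detail that may be worth checking when building the index with rowtotal(): by default it treats missing items as zero, so rows with missing items in the unimputed (m=0) data still receive a total. A small sketch outside of mi, with made-up values and hypothetical variable names, comparing it with the missing option and with plain addition:

```stata
clear
input item1 item2 item3
1 2 3
4 . 6
. . .
end

egen total_rt  = rowtotal(item1-item3)            // missing items count as 0
egen total_rtm = rowtotal(item1-item3), missing   // missing only if all items are missing
generate total_sum = item1 + item2 + item3        // missing if any item is missing
list
```

Which behaviour is appropriate depends on how incomplete rows should look before imputation; plain addition is the most conservative of the three.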
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17704
#4

17 Jun 2016, 00:33

Sebby:
if you're doing a cost-effectiveness analysis (CEA) with missing values, the following reference might be of interest: http://www.ncbi.nlm.nih.gov/pubmed/12720255.
As an aside, please note that this is a multidisciplinary forum, and using acronyms may not be the best way to capture the attention of listers engaged in different research fields who may well be able to reply helpfully.

Kind regards,
Carlo
(Stata 19.0)