Hi,
After all the helpful advice I have been given over the last couple of days by members of this forum, I am hoping that this is my last question "for now":
Does anyone know how to bootstrap a dataset with missing data and then use the bootstrap samples for imputation, preferably with mi impute? I have a dataset from a clinical study where health-related quality of life (EQ-5D questionnaire) has been collected on a number of occasions, but many participants have not responded to all questionnaires.
Bootstrap should apparently be implemented before the imputation, according to Schomaker and Heumann. In a previous question it was suggested that I try the programs/code provided by Glick for bootstrapping and analysis of cost-effectiveness/-utility data. I have also found the code provided by Faria and colleagues (in the supplemental material to an article from 2014), which does multiple imputation first and bootstraps afterwards to get the results. However, I have not been able to find a method for doing the opposite, bootstrap before mi impute, or to adjust the code from these two sources to my needs. Does anyone have a suggestion on how to do this?
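The order of operations asked about here (resample first, impute inside each bootstrap sample, then analyse) can be sketched roughly as follows. This is a language-neutral illustration in Python, not Stata code and not the method of any of the authors mentioned above. The function names are hypothetical, the data are a single utility variable for simplicity, and a simple random hot-deck draw stands in for a real imputation model such as mi impute chained. In Stata the resampling step would correspond to something like bsample.

```python
import random
import statistics


def bootstrap_then_impute(values, n_boot=200, n_imp=5, seed=1):
    """Return one estimate of the mean per bootstrap replicate.

    values: list of utilities, with None marking a missing response.
    The imputation used here is a hot-deck stand-in (an assumption),
    not a proper chained-equations model.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Step 1: resample participants with replacement, missing values included.
        sample = [rng.choice(values) for _ in values]
        observed = [v for v in sample if v is not None]
        if not observed:
            continue  # degenerate replicate: nothing to impute from
        # Step 2: impute within this replicate only, n_imp times.
        imputed_means = []
        for _ in range(n_imp):
            completed = [v if v is not None else rng.choice(observed)
                         for v in sample]
            imputed_means.append(statistics.mean(completed))
        # Step 3: average over the imputations, giving one estimate per replicate.
        estimates.append(statistics.mean(imputed_means))
    return estimates


def percentile_interval(estimates, level=0.95):
    """Simple index-based percentile interval over the replicate estimates."""
    ordered = sorted(estimates)
    lower = int((1 - level) / 2 * (len(ordered) - 1))
    upper = int((1 + level) / 2 * (len(ordered) - 1))
    return ordered[lower], ordered[upper]
```

Calling bootstrap_then_impute on the list of utilities and passing the result to percentile_interval gives an interval that reflects both sampling and imputation uncertainty, because the imputation is redone inside every replicate rather than once on the original data.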
* The main issue at hand, however, is the application of mi impute to my bootstrapped data with missing information. Does anyone know, or have a suggestion on, how to get started on making my bootstrap samples usable for imputation, and on getting mi impute to impute every one of the bootstrap samples? I have "a feeling" that this is somewhat connected to my difficulties in understanding scalars and matrices, see below.
* One of the problems is that I have difficulty understanding in which scalars, matrices et cetera "things" are stored, as the names of these appear to differ between the code I have found and what I find in my Stata. Maybe there are differences between Stata versions? Does anyone have a suggestion for reading about this that is more accessible, for someone with no previous experience of this type of data, than the PDF manual or the help files? For example, the code by Faria and colleagues refers to beta[1,1] and vari[3,3], among others. I cannot find any such names in my "list", and I do not know how to read scalars and matrices well enough to translate them to what I find in my list or in the manual and help files. I'm sorry for the confused question, but I don't really know how to explain my problem better.
* A related issue, for which the code by Faria and colleagues hints at a solution (through mi impute chained), is that many participants have responded to most but not all questionnaires. Thus, I will not be able to calculate the quality-adjusted life years beforehand. Instead I will have to first do the bootstrap, then do chained imputation of the five responses to the health-related quality of life questionnaire (baseline, 4 weeks, 8 weeks, 18 weeks and 52 weeks), and finally calculate the quality-adjusted life years. Looking at the Faria code, I guess I will have to create a number of new variables indicating the quality-adjusted life years during different same-length time intervals of the studied year. As I have not managed to get any of my attempted code to work yet, I cannot say how this will influence other parts of the code.
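The final step in the last point, turning the five (imputed) utility values into quality-adjusted life years, could look something like the sketch below. It is again Python rather than Stata, and it assumes a trapezoidal area-under-the-curve rule over the one-year follow-up, which is a common choice but is not stated in the post or confirmed as the method used by Faria and colleagues. The function name and the constant are hypothetical.

```python
# Measurement times in weeks: baseline, 4, 8, 18 and 52 weeks (from the post).
WEEKS = [0, 4, 8, 18, 52]


def qaly_trapezoid(utilities, weeks=WEEKS, weeks_per_year=52):
    """Area under the utility curve in years, using the trapezoidal rule.

    utilities: one EQ-5D utility per measurement occasion.
    Returns None if any value is still missing, since the values
    have to be imputed before QALYs can be calculated.
    """
    if len(utilities) != len(weeks):
        raise ValueError("need one utility per measurement occasion")
    if any(u is None for u in utilities):
        return None
    total = 0.0
    for i in range(len(weeks) - 1):
        # Average utility over the interval times the interval length in years.
        mean_utility = (utilities[i] + utilities[i + 1]) / 2
        total += mean_utility * (weeks[i + 1] - weeks[i]) / weeks_per_year
    return total
```

Note that the intervals between occasions are of unequal length here (4, 4, 10 and 34 weeks), so each interval is weighted by its own duration. A participant with a missing response at any occasion gets no QALY value until that response has been imputed, which is why the QALY calculation comes last.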
I am using Stata 14.
I would be happy to get any advice on how to proceed with this!
Kind regards,
Hanna