Hi StataList,
I am doing research on subjective well-being, using cross-sectional survey data for two years (not panel). The integrated dataset has some missing values for my dependent variables, Life Satisfaction (8 missing values) and Happiness (28 missing values). Even though this may not sound like a big number, I am focusing on small groups from my dataset, on each year separately, and every observation matters for the size of my sample. So I decided to proceed with multiple imputation. I followed the steps described in the following book: Mehmet Mehmetoglu and Tor Georg Jakobsen (2016) 'Applied Statistics Using Stata: A Guide for the Social Sciences' and did the process only for Life Satisfaction first. I also compared my commands with some YouTube videos and it all looks good.
The results I am getting for the regression after the imputation are based on the entire sample size, which is 2014. I assume this means that the imputation was correct and successful. However, once I save my dataset and then reopen it to run regressions, it gives me results where the number of observations is 2070, which now exceeds the regular size of the sample. How is this possible? Where did I make a mistake? I guess I need to do something different when I am saving my data. Or should I always use 'mi estimate: regress...' even after the imputation was done and the data was saved? I assume there is a way to save the data with the imputed variable as a new dataset where I can then run different analyses without including 'mi estimate' every time.
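For anyone who wants to reproduce the situation, a minimal sketch of this kind of workflow might look like the following. The file names, the variable names (lifesat, age, income, educ), the mlong storage style, the number of imputations and the seed are all placeholders chosen for illustration; they are not the actual commands or data from this analysis.

```stata
* Sketch only: file names, variable names, style and add() value are assumptions
use survey_data, clear

* Declare the data as mi data (mlong style assumed here) and register
* the variable whose missing values are to be imputed
mi set mlong
mi register imputed lifesat

* Impute lifesat from the covariates; add(5) creates five imputations
mi impute regress lifesat age income educ, add(5) rseed(12345)

* Fit the model on the imputed data, combining results across imputations
mi estimate: regress lifesat age income educ

* Save the mi dataset, then reopen it later
save survey_data_mi, replace
use survey_data_mi, clear

* Report the mi storage style and the number of imputations in the file
mi describe

* Estimation on the reopened mi data still goes through mi estimate
mi estimate: regress lifesat age income educ
```

After reopening, mi describe is a quick way to confirm that the file is still recognised as mi data and to see how it is stored, which helps when comparing observation counts before and after saving.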
I really appreciate you taking the time to read my post and offer any suggestions.
Best wishes,
Mirjana