Hello!
I've been trying to run a multiple imputation procedure on a panel data set.
Following common practice, I first reshaped my data from long to wide format and then launched the imputation procedure.
The process crashed with the following error message:
Code:
imputing m=1 through m=27
mi impute: VCE is not positive definite
The posterior distribution from which mi impute drew the imputations for Tax_1996 is not
proper when the VCE estimated from the observed data is not positive definite. This may
happen, for example, when the number of parameters exceeds the number of observations. Choose
an alternate imputation model.
I have 16 variables for 120 countries observed from 2005 to 2012. Clearly, the problem with the MI procedure arose because reshaping the data from long to wide produced 128 new variables ((2012-2005+1)*16), while I have only 120 countries to observe, so the number of variables exceeds the number of observations.
I searched the web and found the following information about MI for panel data:
http://www.stata.com/statalist/archi.../msg00198.html :
"Neither -ice- nor -mi impute- has an imputation method specifically designed for panel data. (The -mi xtset- command does declare panel data but does not change which imputation methods are available.) We do, however, have a FAQ that has a few suggestions for applying -mi impute- to panel data."
http://www.stata.com/support/faqs/st...and-mi-impute/ - nothing helpful for my case
http://www.ats.ucla.edu/stat/stata/f...ngitudinal.htm :
"Once we are familiar with our data, the first step in the imputation process is to reshape the data from long to wide. Having the data in wide form takes care of both the nesting issue (there is now only one row of data per student) and allows us to easily use variables from the other time periods as predictors of missing values, since in wide form, they are just other variables in the dataset (rather than being part of another row in the dataset). We do this using the reshape command, and then check the output from reshape to make sure everything went the way it should, and it has. Note that the variable time is dropped, and that there are now three read variables and three math variables."
http://www.ssc.wisc.edu/sscc/pubs/stata_mi_models.htm :
"Panel/Longitudinal Data
If you have data where units are observed over time, the best predictors of a missing value in one period are likely the values of that variable in the previous and subsequent periods. However, the imputation model can only take advantage of this information if the data set is in wide form (one observation per unit, not one observation per unit per time period). You can convert back to long form after imputing if needed. To convert the data to wide form before imputing, use reshape. To convert back to long form after imputing, use mi reshape. This has the same syntax as reshape, but makes sure the imputations are handled properly. If you're not familiar with reshape, see the Hierarchical Data section of Stata for Researchers."
The only solution I could think of was to impute the missing data over shorter time intervals. Empirically, I found that a 3-year period (and hence only 3*16 = 48 new variables for 120 countries) was OK for the MI procedure.
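To make the variable counts explicit, here is a small Python sketch of the arithmetic (Python only because it is easy to check; the function name wide_vars is just an illustration). It assumes that all 16 variables are reshaped for every year in the window, and it is only a rough check: staying below the number of countries does not by itself guarantee that the imputation model can be estimated.

```python
# Rough count of wide-format variables per imputation window.
# Assumption: every one of the 16 variables is reshaped for every year.
N_COUNTRIES = 120  # observations in wide form (one row per country)
N_VARS = 16        # variables per country-year in long form


def wide_vars(first_year, last_year, n_vars=N_VARS):
    """Number of wide-format columns: one per variable per year."""
    return (last_year - first_year + 1) * n_vars


windows = [(2005, 2012), (2005, 2007), (2008, 2010), (2011, 2012)]
counts = {window: wide_vars(*window) for window in windows}

for (start, end), k in counts.items():
    status = "exceeds" if k > N_COUNTRIES else "is below"
    print(f"{start}-{end}: {k} wide variables, which {status} {N_COUNTRIES} countries")
```

Under that assumption the full 2005-2012 window gives 128 wide variables, more than the 120 countries, while each shorter window stays well below 120.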
As a result, I got imputed data for three periods: 2005-2007, 2008-2010, 2011-2012.
My question is:
Can I merge the MI results from the sub-periods (2005-2007, 2008-2010, 2011-2012) into a single period (2005-2012) and go on with my analysis, or must I perform the imputation and the panel data analysis on the same intervals (and hence perform the panel data analysis three times)?
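To be concrete about what "merge" would mean here, a minimal Python sketch with made-up values (not my actual data; the country code, the numbers, and the helper stack_periods are all hypothetical). For each imputation number m, the rows from the three sub-periods are simply appended. Note that pairing draw m from one sub-period with draw m from another is itself an assumption, since the sub-periods were imputed separately; whether the result is valid for analysis is exactly what I am asking.

```python
# Hypothetical imputed data: {imputation number m: [(country, year, value), ...]}.
# Country code and values are made up for illustration only.
period_1 = {1: [("AUT", 2005, 41.2), ("AUT", 2006, 41.0), ("AUT", 2007, 40.8)],
            2: [("AUT", 2005, 41.2), ("AUT", 2006, 40.5), ("AUT", 2007, 40.8)]}
period_2 = {1: [("AUT", 2008, 41.5), ("AUT", 2009, 41.1), ("AUT", 2010, 41.0)],
            2: [("AUT", 2008, 41.5), ("AUT", 2009, 41.9), ("AUT", 2010, 41.0)]}
period_3 = {1: [("AUT", 2011, 41.3), ("AUT", 2012, 41.6)],
            2: [("AUT", 2011, 41.3), ("AUT", 2012, 42.0)]}


def stack_periods(*periods):
    """For each imputation m, append the rows from every sub-period."""
    m_values = set(periods[0])
    if any(set(p) != m_values for p in periods):
        raise ValueError("all sub-periods must share the same imputation numbers")
    # Sort rows by (country, year) so each stacked data set is in panel order.
    return {m: sorted(row for p in periods for row in p[m])
            for m in sorted(m_values)}


stacked = stack_periods(period_1, period_2, period_3)
years = [year for _, year, _ in stacked[1]]
print(years)  # [2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012]
```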
I wasn't able to find a definite answer to this question. The authors all agree that the imputation model and the analytical model should be parsimonious, but clear guidelines on the extent of this parsimony are missing.
Thank you!