Multiple Imputation with refreshment samples

Marry Lee

Join Date: Nov 2020

Posts: 189
#1


03 Sep 2024, 08:03

Dear all,

I searched the forum for solutions but couldn't find any.

I have panel data with 5 waves, and new people are recruited in each wave. For example, some people are recruited only from wave 3 onward.

I am running MI for the missing data as follows:

Code:

mi reshape wide Y1 Y2 X1 X2 X3, i(n_id) j(cycle)
mi register imputed Y1* Y2* X1*
mi xtset, clear
mi impute chained (regress) Y1* Y2* X1* = X2 X3, add(20) augment noisily showcommand rseed(12345687) force

The problem is that X2 and X3 have no missing values before reshaping, but because some people are recruited only from a given wave onward, missing values appear for these two variables after the reshape. For example, if someone is recruited in wave 3, X21, X22, X31 and X32 (variables X2 and X3 in waves 1 and 2) will be missing for that person.

So I used the force option to tell Stata to ignore this. But now I have another problem:

Code:

mi impute: VCE is not positive definite
    The posterior distribution from which mi impute drew the imputations for ndvi2506 is not proper when the VCE estimated from the observed data is not positive definite. This may happen, for example, when the number of parameters exceeds the number of observations. Choose an alternate imputation model.
error occurred during imputation of ...

Is there a particular approach to imputing this kind of dataset, where some people are recruited only from a particular wave onward and we do not want to impute the earlier waves for them?
Help me please.

Thank you
daniel klein

Join Date: Mar 2014

Posts: 3842
#2

03 Sep 2024, 14:33

Never use the force option with mi.

Why don't you register the respective variables imputed, move them to the left of the equals sign in mi impute, and impute the respective missing values along with the other missing values? You can still omit the imputed values for the respective observations from the substantive models.
By the way, if you really use regress for all variables, consider using mvn instead; it's much faster.

As for the error message, if you end up with more variables than observations after the reshape, you cannot use all variables in all (conditional) models.
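The suggestion above could be sketched roughly as follows. This is only an illustration under assumptions: the variable names follow the original post, first_wave is a hypothetical variable holding the wave in which each person was recruited, and the substantive model is a placeholder. With 5 waves of 5 variables, the conditional models may still need to be trimmed if the number of parameters approaches the number of observations.

```stata
* Sketch only. Assumes the data are already mi set, and that first_wave
* (hypothetical, not in the original post) is constant within n_id.
mi reshape wide Y1 Y2 X1 X2 X3, i(n_id) j(cycle)

* Register X2* and X3* as imputed too, so force is no longer needed
mi register imputed Y1* Y2* X1* X2* X3*
mi xtset, clear

* All incomplete variables go to the left of the equals sign
mi impute chained (regress) Y1* Y2* X1* X2* X3*, add(20) rseed(12345687)

* Alternative when every variable is treated as continuous:
* mi impute mvn Y1* Y2* X1* X2* X3*, add(20) rseed(12345687)

* Back to long form; leave pre-recruitment rows out of the substantive model
mi reshape long Y1 Y2 X1 X2 X3, i(n_id) j(cycle)
mi estimate: regress Y1 X1 X2 X3 if cycle >= first_wave
```

The if condition uses only variables that are not imputed, so the estimation sample is the same in every imputation.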
Marry Lee

Join Date: Nov 2020

Posts: 189
#3

04 Sep 2024, 05:21

Thank you, daniel klein. That solution worked.