Hi,
I have two questions concerning the imputation (techniques) of multilevel data. I have a multilevel data set (students in classes, in schools, in areas, in districts) and would like to impute variables on each of the four higher levels. My solution thus far has been:
-imputing all variables in one common model (wide format):
mi impute chained (pmm, knn(5)) varl1 varl1 varl1 varl2 varl2 varl2 varl3 varl3 varl3 varl4 varl4 varl4 = varl1 varl1 varl1, add(15) noisily rseed(52312)
- and then, within each imputation, calculating the mean (metric variables) or the median (categorical variables) within each higher-level unit (in case imputed values differ within a unit):
foreach x of numlist 1/15 {
egen _`x'_foo= median(_`x'_varl2), by(level2)
replace _`x'_varl2= _`x'_foo
drop _`x'_foo
egen _`x'_foo= mean(_`x'_varl2), by(level2)
replace _`x'_varl2= _`x'_foo
drop _`x'_foo
.......
egen _`x'_foo= median(_`x'_varl3), by(level3)
replace _`x'_varl3= _`x'_foo
drop _`x'_foo
egen _`x'_foo= mean(_`x'_varl3), by(level3)
replace _`x'_varl3= _`x'_foo
drop _`x'_foo
...
egen _`x'_foo= median(_`x'_varl4), by(level4)
replace _`x'_varl4= _`x'_foo
drop _`x'_foo
egen _`x'_foo= mean(_`x'_varl4), by(level4)
replace _`x'_varl4= _`x'_foo
drop _`x'_foo
}
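For reference, the same aggregation step can be written more compactly by looping over the levels as well. This is only a sketch: it assumes one metric variable per level, named varl2 to varl4 as in the placeholders above, and the wide-format imputed copies _1_varl2 ... _15_varl4. For categorical variables, mean() would be swapped for median().

```stata
* Sketch only: variable and level names are the placeholders used in this post.
* For each imputation m and each higher level, overwrite the imputed values
* with the cluster mean so that they are constant within a cluster.
forvalues m = 1/15 {
    foreach lev in 2 3 4 {
        tempvar agg
        egen `agg' = mean(_`m'_varl`lev'), by(level`lev')
        replace _`m'_varl`lev' = `agg'
        drop `agg'
    }
}
```

Using a tempvar instead of a fixed name such as foo avoids clashes with variables already in the data set.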
The commands are all working and I can run analyses, which I do with:
mi est, post noisily cmdok: gllamm DV varl1 varl2 varl3 varl4, i(level2 level3 level4) link(logit) f(binom)
However, I'm not sure whether that procedure is correct or adequately addresses the structure in my data. Does anybody have any thoughts on whether that procedure is OK? And as a follow-up question: is it OK that I force mi est to run gllamm by using the cmdok option (as an .ado, gllamm would otherwise not run with mi est)?
I appreciate all thoughts. Thank you,
Julia