mi: multiple imputation; handling missing values

Chiara Tasselli

Join Date: Feb 2021

Posts: 119
#1

14 Dec 2023, 05:43

Hello everyone, I am working with multiple imputation in Stata. I have a continuous variable (Segr_v) with some missing values that I would like to fill in by estimating them through a regression on these predictors (ATECO_2digit NUMERO_COMPLESSIVO SEDE_PROVINCIA_label F_share).

The command I am using is the following:

Code:

mi set wide
mi register imputed Segr_v
mi impute regress Segr_v ATECO_2digit NUMERO_COMPLESSIVO SEDE_PROVINCIA_label F_share if TODROP_overall_final != 1 & NACF != 1 & SizeOver50 == 1 & duplicates_drop != 1 & ImpresaFEM != 1 & ImpresaM != 1, add(1) rseed(1234)

The issue is that the code returns imputed values ranging from -0.76 to 1.18, while the logical range of my variable is [0, 1]. Can the command be adjusted to take this into account, or do you recommend replacing the out-of-range values with the lower and upper bounds? Additionally, could you explain "add(1)" in more detail? Currently, it adds one extra variable (because I have set "wide"), but what would be the benefit of specifying a larger value (e.g., add(20))?

Moreover, I am not familiar with strategies for imputing missing values. Do you have further suggestions or alternative code for reaching my goal?

Many thanks in advance for your time.
Wishing you a great week ahead.

Last edited by Chiara Tasselli; 14 Dec 2023, 05:58.
Tags: MI, missing values, multiple imputation
Rich Goldstein

Join Date: Mar 2014

Posts: 4479
#2

14 Dec 2023, 06:00

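mi impute regress will almost always do this, because it imputes from a normal linear regression model that knows nothing about the bounds of your variable. In theory this is not a problem but, like you, many people object. Use "pmm" (predictive mean matching) instead of regress; see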

Code:

h mi impute
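
A minimal sketch of what the switch to pmm could look like, reusing the variable names and the if-condition from post #1. It assumes the data have not yet been mi set, and knn(5) is only an illustrative value (recent Stata versions require knn() to be specified).

```stata
* Sketch only: same setup as in post #1, with pmm in place of regress
mi set wide
mi register imputed Segr_v

* pmm fills each missing value with an observed value of Segr_v borrowed from
* one of the knn() closest cases, so imputations stay within the observed range
mi impute pmm Segr_v ATECO_2digit NUMERO_COMPLESSIVO SEDE_PROVINCIA_label F_share ///
    if TODROP_overall_final != 1 & NACF != 1 & SizeOver50 == 1 ///
     & duplicates_drop != 1 & ImpresaFEM != 1 & ImpresaM != 1, ///
    add(1) knn(5) rseed(1234)
```

Because every imputed value is copied from an observed one, values outside [0, 1] cannot occur as long as the observed Segr_v values themselves lie in that range.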
Chiara Tasselli

Join Date: Feb 2021

Posts: 119
#3

14 Dec 2023, 07:04

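Originally posted by Rich Goldstein

mi impute regress will almost always do this, because it imputes from a normal linear regression model that knows nothing about the bounds of your variable. In theory this is not a problem but, like you, many people object. Use "pmm" (predictive mean matching) instead of regress; see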

Code:

h mi impute

Thank you very much. I just tried it, and the results seem much more reasonable.
Once again, many thanks for your help.
Rich Goldstein

Join Date: Mar 2014

Posts: 4479
#4

14 Dec 2023, 07:41

Glad it worked out, but note that there can be problems with pmm. In particular, some observations may be "chosen" as donors too often. To help guard against this, make sure that the number of "nearest neighbors" being drawn from is at least 5 (and 10 as a minimum may well be better). Here I am referring to the number you placed in the "knn(#)" option.
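
A hedged sketch of how a larger donor pool might be combined with more imputations, again reusing the names from post #1. It assumes Segr_v is already registered as imputed and that no imputations have been added yet; knn(10) and add(20) are illustrative values, not tuned recommendations.

```stata
* Sketch only: draw each imputation from the 10 nearest neighbors
* and create 20 imputations instead of 1
mi impute pmm Segr_v ATECO_2digit NUMERO_COMPLESSIVO SEDE_PROVINCIA_label F_share ///
    if TODROP_overall_final != 1 & NACF != 1 & SizeOver50 == 1 ///
     & duplicates_drop != 1 & ImpresaFEM != 1 & ImpresaM != 1, ///
    add(20) knn(10) rseed(1234)

* Summarize Segr_v in the original data (m = 0) and in each imputation
* to confirm that the minimum and maximum stay inside [0, 1]
mi xeq: summarize Segr_v
```

On the add() question from post #1: add(#) sets how many imputations are created. With several imputations, mi estimate can fit the analysis model in each one and combine the results so that the standard errors reflect the uncertainty about the imputed values, which a single imputation cannot do.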
Chiara Tasselli

Join Date: Feb 2021

Posts: 119
#5

03 Jan 2024, 06:46

Originally posted by Rich Goldstein

Glad it worked out, but note that there can be problems with pmm. In particular, some observations may be "chosen" as donors too often. To help guard against this, make sure that the number of "nearest neighbors" being drawn from is at least 5 (and 10 as a minimum may well be better). Here I am referring to the number you placed in the "knn(#)" option.

Once again many thanks for your excellent suggestions.