Imputed values are outside the original interval

Stephen Okiya

Join Date: Jun 2025

Posts: 280
#1

19 Jan 2023, 07:09

Hi Stata Users,
Sorry for asking what may be perceived as a conceptual issue, but I believe I can get an answer from the forum.
I have a variable that is a probability (i.e., between 0 and 1), and I am trying to impute it using the command

Code:

mi impute chained (regress) pr_attend_imput = urban female_headed hhsize hh_head_no_educ clust_literacy num_children hh_member_formal_empl hh_orphan i.hv024, add(20) by(age)

However, the imputed values are outside the original range, and I am wondering how I can address this.

Thanks in advance!
Richard Williams

Join Date: Apr 2014

Posts: 4983
#2

19 Jan 2023, 07:38

If, say, the observed variable ranges from .3 to .7 and you get imputed values like .25 or .76, that wouldn't concern me. Such values may be legit even if nobody in the sample had them.

If, however, you get imputed values less than 0 or greater than 1, i.e. impossible values, then I'd be more worried.

It would be nice if there were an mi impute fracreg command, but there isn't.

Perhaps mi impute truncreg would be best? I'm not sure.
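
For readers who want to try the truncated-regression idea, here is a minimal sketch. It reuses the variable names from post #1 and assumes the data are already mi set with pr_attend_imput registered as imputed. The limits ll(0) and ul(1) are my assumption about the admissible range, and I have not checked how observed values lying exactly on a limit are handled, so consult the manual entry before relying on it.

```stata
* Sketch only: impute a variable bounded between 0 and 1 with truncated regression.
* ll() and ul() set the lower and upper truncation limits (assumed here to be 0 and 1).
mi impute truncreg pr_attend_imput urban female_headed hhsize hh_head_no_educ ///
    clust_literacy num_children hh_member_formal_empl hh_orphan i.hv024, ///
    add(20) ll(0) ul(1) by(age)

* Inspect the range in the original data (m=0) and in every imputation
mi xeq: summarize pr_attend_imput
```

If the minimum and maximum reported for each imputation stay inside 0 and 1, the impossible values from the linear model are gone.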

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
daniel klein

Join Date: Mar 2014

Posts: 3845
#3

19 Jan 2023, 07:40

Predictive mean matching (pmm) might be an alternative to linear regression.
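
A minimal sketch of what that could look like, reusing the variable names from post #1 and assuming the data are already mi set with pr_attend_imput registered as imputed. The choice of knn(5) is arbitrary and only for illustration. Because pmm fills in each missing value with an observed value borrowed from a nearby case, the imputed values cannot leave the observed range.

```stata
* Sketch only: predictive mean matching for a single imputation variable.
* knn(5) draws from the 5 nearest observed neighbours; 5 is an arbitrary choice.
mi impute pmm pr_attend_imput urban female_headed hhsize hh_head_no_educ ///
    clust_literacy num_children hh_member_formal_empl hh_orphan i.hv024, ///
    add(20) knn(5) by(age)
```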
Stephen Okiya

Join Date: Jun 2025

Posts: 280
#4

19 Jan 2023, 07:55

Richard Williams that's exactly my worry!
Stephen Okiya

Join Date: Jun 2025

Posts: 280
#5

19 Jan 2023, 08:01

daniel klein thanks so much for the proposal. I have tried pmm and I have 2 questions, since I am not really familiar with the approach:

1. It seems pmm is not compatible with chained.
2. The imputed values are either 0 or 1.

Please correct me if this isn't the case.
daniel klein

Join Date: Mar 2014

Posts: 3845
#6

19 Jan 2023, 08:25

1. pmm is perfectly compatible with a chained equations approach. You will have to enclose it in parentheses just like you did with regress in your original post.
2. If the imputed values are always 0 or 1, then you probably do not have a variable that contains probabilities but a binary indicator that is either true or false. In that case, use a logit (or probit) model to impute missing values.

Edit:
By the way, your original post here implies that you are imputing values for only one variable. If so, there is obviously no need for chained equations.
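
To make the two numbered points concrete, here is a hedged sketch using the variable names from post #1. It assumes the data are already mi set and pr_attend_imput is registered as imputed; knn(5) is an arbitrary illustrative choice. The two mi impute commands are alternatives, so run only the one that matches the variable.

```stata
* First check what the variable actually contains: probabilities or just 0/1?
tabulate pr_attend_imput, missing

* Point 1: pmm within chained equations. The method goes in parentheses,
* with method-specific options after a comma.
mi impute chained (pmm, knn(5)) pr_attend_imput = urban female_headed hhsize ///
    hh_head_no_educ clust_literacy num_children hh_member_formal_empl ///
    hh_orphan i.hv024, add(20) by(age)

* Point 2 (alternative): if the variable is a binary 0/1 indicator,
* impute it with a logit model instead.
mi impute logit pr_attend_imput urban female_headed hhsize hh_head_no_educ ///
    clust_literacy num_children hh_member_formal_empl hh_orphan i.hv024, ///
    add(20) by(age)
```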

Last edited by daniel klein; 19 Jan 2023, 08:40.
Stephen Okiya

Join Date: Jun 2025

Posts: 280
#7

27 Jan 2023, 03:57

Thanks so much, daniel klein, for your guidance. I sincerely appreciate it.