  • Imputed values are outside the original interval

    Hi Stata Users,
    Sorry for asking what may be perceived as a conceptual question, but I believe the forum can help.
    I have a variable that is a probability (i.e., between 0 and 1) and am trying to impute it using the command
    Code:
    mi impute chained (regress) pr_attend_imput = urban female_headed hhsize hh_head_no_educ clust_literacy num_children hh_member_formal_empl hh_orphan i.hv024, add(20) by(age)
    However, the imputed values fall outside the original range, and I am wondering how I can address this.

    Thanks in advance!

  • #2
    If, say, the observed variable ranges from .3 to .7 and you get imputed values like .25 or .76, that wouldn't concern me. Such values may be legit even if nobody in the sample had them.

    If, however, you get imputed values less than 0 or greater than 1, i.e. impossible values, then I'd be more worried.
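
    A quick way to check for impossible values is sketched below. It assumes the imputation from post #1 has already been run with add(20), so that imputations 1 through 20 exist. The !missing() guard is there because Stata treats missing values as larger than any number.

    ```stata
    * Range of the variable in each of the 20 imputed datasets
    mi xeq 1/20: summarize pr_attend_imput

    * Number of non-missing values outside [0, 1] in each imputed dataset
    mi xeq 1/20: count if (pr_attend_imput < 0 | pr_attend_imput > 1) & !missing(pr_attend_imput)
    ```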

    It would be nice if there were an mi impute fracreg command, but there isn't.

    Perhaps mi impute truncreg would be best? I'm not sure.
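
    As a rough sketch of what that could look like for a do-file, using the variable list from post #1: the ll(0) and ul(1) limits are an assumption based on the variable being a probability, and the first two commands are only needed if the data are not yet mi set. Truncated regression treats the limits as bounds the outcome lies strictly inside, so observed values sitting exactly at 0 or 1 may need attention first.

    ```stata
    * Skip these two lines if the data are already mi set
    * and the variable is already registered as imputed
    mi set wide
    mi register imputed pr_attend_imput

    * Truncated regression imputation, bounded below by 0 and above by 1
    mi impute truncreg pr_attend_imput urban female_headed hhsize        ///
        hh_head_no_educ clust_literacy num_children                      ///
        hh_member_formal_empl hh_orphan i.hv024,                         ///
        add(20) ll(0) ul(1) by(age)
    ```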
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam


    • #3
      Predictive mean matching (pmm) might be an alternative to linear regression.
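
      A minimal sketch of that, reusing the variable list from post #1 and assuming the data are already mi set with pr_attend_imput registered as imputed. The choice of knn(5), the number of nearest neighbors to draw from, is only an illustrative assumption. Because pmm fills in each missing value with an observed value from a donor, the imputed values cannot fall outside the observed range.

      ```stata
      * Predictive mean matching: each missing value is replaced by the
      * observed value of one of the 5 closest cases on the linear prediction
      mi impute pmm pr_attend_imput urban female_headed hhsize          ///
          hh_head_no_educ clust_literacy num_children                   ///
          hh_member_formal_empl hh_orphan i.hv024,                      ///
          add(20) knn(5) by(age)
      ```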


      • #4
        Richard Williams, that's exactly my worry!


        • #5
          daniel klein, thanks so much for the proposal. I have tried pmm and have two questions, since I am not really familiar with the approach:
          1. It seems pmm is not compatible with chained.
          2. The imputed values are either 0 or 1.
          Please correct me if this isn't the case.


          • #6
            1. pmm is perfectly compatible with a chained equations approach. You will have to enclose it in parentheses just like you did with regress in your original post.
            2. If the imputed values are always 0 or 1, then you probably do not have a variable that contains probabilities but a binary indicator that is either true or false. In that case, use a logit (or probit) model to impute missing values.
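
            Both points can be sketched as follows, reusing the variable list from post #1 and assuming the data are already mi set with pr_attend_imput registered as imputed. The knn(5) value is an illustrative assumption, and the two imputation commands are alternatives: run one or the other, not both.

            ```stata
            * Check what the variable actually contains: a 0/1 indicator shows
            * only two unique values, a probability shows many
            codebook pr_attend_imput

            * Alternative A: pmm inside chained, in parentheses like (regress)
            mi impute chained (pmm, knn(5)) pr_attend_imput = urban             ///
                female_headed hhsize hh_head_no_educ clust_literacy             ///
                num_children hh_member_formal_empl hh_orphan i.hv024,           ///
                add(20) by(age)

            * Alternative B: if the variable turns out to be binary,
            * impute it with a logit model instead
            mi impute logit pr_attend_imput urban female_headed hhsize          ///
                hh_head_no_educ clust_literacy num_children                     ///
                hh_member_formal_empl hh_orphan i.hv024,                        ///
                add(20) by(age)
            ```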

            Edit:
            By the way, your original post here implies that you are imputing values for only one variable. If so, there is obviously no need for chained equations.
            Last edited by daniel klein; 19 Jan 2023, 08:40.


            • #7
              Thanks so much, daniel klein, for your guidance. I sincerely appreciate it.
