  • MI Impute Chained Error: Many "Perfect Predictors"

    I am encountering an error using the mi impute chained command in Stata 14.1 to impute on a dataset with 1500 observations. I typed the following:
    mi impute chained (ologit) guilty age educ income urbanicity (logit) male white black jewish protestant catholic other_christian non_judeochristian republican democrat independent northeast midwest farwest mountain, add(5)
    Those are precisely the variables I will use to estimate "guilty."
    I get this error:
    Performing chained iterations ...
    mi impute logit: perfect predictor(s) detected
    Variables that perfectly predict an outcome were detected when logit
    executed on the observed data. First, specify mi impute's option noisily to
    identify the problem covariates. Then either remove perfect predictors from
    the model or specify mi impute logit's option augment to perform augmented
    regression; see The issue of perfect prediction during imputation of
    categorical data in [MI] mi impute for details.
    error occurred during imputation of guilty income urbanicity republican democrat
    independent on m = 1

    r(498);
    I get the same error if I include any two or more of those six variables. Those are also the only variables with missing data -- so the whole point of the imputation is to have them predict each other. Eliminating all but one would defeat the purpose.

    I have tried the augment option, but it takes a very long time to run, and this is a program I will need to run repeatedly, so I would like it to be fairly efficient.

    Thank you so much for any advice on how I might fix this problem.
    Last edited by Maggie Wittlin; 27 May 2016, 22:33.

  • #2
    I think there needs to be an equal sign (=) somewhere in your command line, perhaps after guilty. Try writing something like these:

    Impute bmi and age using linear regression
    . mi impute chained (regress) bmi age = attack smokes hsgrad female, add(10)

    Impute bmi using predictive mean matching and age using linear regression
    . mi impute chained (pmm) bmi (regress) age = attack smokes hsgrad female, replace

    • #3
      Thanks, Cyrus.

      I've gotten the program to work. I'm not wholly sure what made the difference, but I changed three things: (1) closed and re-opened Stata, (2) changed two "ologit" variables (education and income) to "regress," since they have a large number of categories, and (3) added "augment" to the logit regression. Now it functions well.

      I didn't need the equal sign -- I'm doing chained imputations so I can fill in each of the variables based on the others, so I don't think I should be distinguishing them from each other in any way.

      • #4
        Maggie

        I'm sure you have recognised the main problem: when imputing one or more of your categorical variables, some cells had a count of zero, so one of the imputation models was trying to estimate an infinite parameter. -augment- aims to fix this, but doesn't always succeed. Changing the imputation method to -regress- reduces the number of parameters being estimated, which improves things further.
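
        One way to look for such empty cells before imputing is to cross-tabulate the incomplete categorical variables against each other. This is only a sketch: it assumes the problem variables are the ones named in the error message, and the pairs shown are examples rather than a complete list.

```stata
* Cross-tabulate pairs of incomplete categorical variables and look for
* cells with a count of zero (variable names taken from the error message).
tabulate republican democrat, missing
tabulate republican independent, missing
tabulate democrat independent, missing
tabulate guilty urbanicity, missing
```

        The error message also suggests adding the noisily option to the mi impute command, which displays the fitted models so the problem covariates can be identified.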

        There is a better approach for educ and income: the -ascontinuous- option of -mi impute ologit-. With it, -mi impute chained- still imputes these variables as ordered categorical, but treats them as continuous covariates when they are used to impute other variables. The code is then
        Code:
        mi impute chained (ologit) guilty age urbanicity (ologit, ascontinuous) educ income (logit) ... , add(5)
        Imputing via -pmm- is another possibility: unlike -regress-, it always imputes values that were actually observed.

        In your original message you noted that you only want to impute six variables. Which six? All the others should go on the right-hand side of '=', as Cyrus notes.
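
        If the six are the ones named in the error message (guilty, income, urbanicity, republican, democrat, independent), the command might look roughly like the following. Treat it as a sketch under that assumption: the mi set style and the seed value are arbitrary choices of mine, and every other variable is assumed to be complete.

```stata
* Assumption: the data have not been mi set yet; the wide style is arbitrary.
mi set wide

* Assumption: these six are the only variables with missing values.
mi register imputed guilty income urbanicity republican democrat independent

* Incomplete variables go to the left of "=", complete predictors to the right.
* The /// line continuations work in a do-file, not at the interactive prompt.
mi impute chained                                                      ///
    (ologit) guilty urbanicity                                         ///
    (ologit, ascontinuous) income                                      ///
    (logit, augment) republican democrat independent                   ///
    = age educ male white black jewish protestant catholic             ///
      other_christian non_judeochristian northeast midwest farwest     ///
      mountain,                                                        ///
    add(5) rseed(12345)
```

        To try predictive mean matching for income instead, replace (ologit, ascontinuous) with (pmm, knn(5)); the value 5 is only an illustration.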

        Tim
