I will start this somewhat lengthy post by repeating my earlier requests on the Wishlist for Stata 17 and the Wishlist for Stata 16.
My problem is with the behavior of mi impute chained when one of the models, usually mlogit, fails to converge. If that happens, Stata will exit with an error message and, more relevant to this post, discard all imputed values that have been added so far. This behavior is consistent with Stata's philosophy of "doing it all or doing nothing at all". It is also useful if there is something wrong with the imputation model that we should fix. The behavior is, however, frustrating if the model in question fails to converge in, say, iteration 7 on m=42. By then, the respective model has successfully converged 416 times (assuming the default burn-in) before it failed -- once. Chances are, there is no systematic problem with that model; chances are, the model will converge again in iteration 8 on m=42.
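The figure of 416 can be checked with a line of arithmetic. This is a sketch that assumes the default burn-in of 10 iterations per imputation and counts only the chained iterations completed before the failure; the local name burnin is just for illustration:

```stata
// 41 finished imputations with 10 iterations each,
// plus the 6 iterations completed within m=42
local burnin 10
display (42 - 1)*`burnin' + (7 - 1)   // displays 416
```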
I argue that stopping the imputation process altogether because one of the models fails to converge once is not only frustrating but also leads to worse imputation models in practice. The reason is that, confronted with the described problem, we, as users, are left with one choice: modify the respective model. There are different ways of modifying the model, such as omitting predictors, collapsing some categories of the outcome, or using a different model, e.g., pmm. None of these modifications is desirable, and all of them will necessarily affect all iterations in all imputations, thus making the imputation model worse. Instead of affecting all iterations in all imputations, I would rather be able to skip the one iteration in which the model happens not to converge.
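For illustration, the three modifications might look roughly like this with the variables from the example later in this post. This is only a sketch: omit() is the per-equation option of mi impute chained for dropping predictors, while the predictor omitted (price) and the categories collapsed (1 into 2) are arbitrary choices made for the example. Each variant changes the model in every iteration of every imputation:

```stata
// Sketch only: three alternative modifications, to be run one at a time.

// (a) omit a predictor (here, arbitrarily, price) from the mlogit equation
mi impute chained                           ///
    (mlogit , augment omit(price)) rep78    ///
    (pmm , knn(3)) mpg price                ///
    , add(10)

// (b) collapse sparse outcome categories (here, arbitrarily, 1 into 2);
//     do this before -mi set- and -mi register-
recode rep78 (1 = 2)

// (c) replace mlogit with pmm for rep78
mi impute chained                           ///
    (pmm , knn(3)) rep78 mpg price          ///
    , add(10)
```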
The community-contributed ice command (Royston, SSC, SJ) offers a persist option that ignores errors, such as non-convergence. It would be even better if we could specify which errors we are willing to ignore and how often we are willing to ignore them. Still, this option is something that StataCorp should seriously consider borrowing. Personally, I trust ice, but I am just a little bit more comfortable with Stata's mi suite. Therefore, I have written a crude workaround wrapper for mi impute that persists in case of non-convergence. Here is an example, using a modified version of auto.dta:
Code:
version 12.1 // needed for the seed
set seed 42
set maxiter 20 // don't want to wait for 16,000 iterations
// example data
sysuse auto , clear
replace mpg = . if runiform()>.6
replace price = . if runiform()>.4
// mi setting
mi set mlong
mi register imp rep78 mpg price
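The modifications above lead to convergence issues when we impute missing values for rep78 with mlogit:
Code: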
. mi impute chained ///
> (mlogit , augment) rep78 ///
> (pmm , knn(3)) mpg price ///
> , add(10) noisily
Conditional models:
rep78: mlogit rep78 mpg price , augment noisily
mpg: pmm mpg i.rep78 price , knn(3) noisily
price: pmm price i.rep78 mpg , knn(3) noisily
Performing monotone imputation, m=1:
Running mlogit on observed data, m=1:
Iteration 0: log likelihood = -93.692061
Iteration 1: log likelihood = -93.692061
Multinomial logistic regression Number of obs = 69
LR chi2(0) = 0.00
Prob > chi2 = .
Log likelihood = -93.692061 Pseudo R2 = 0.0000
------------------------------------------------------------------------------
rep78 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1 |
_cons | -2.70805 .7302967 -3.71 0.000 -4.139406 -1.276695
-------------+----------------------------------------------------------------
2 |
_cons | -1.321756 .3979112 -3.32 0.001 -2.101647 -.5418642
-------------+----------------------------------------------------------------
3 | (base outcome)
-------------+----------------------------------------------------------------
4 |
_cons | -.5108256 .2981424 -1.71 0.087 -1.095174 .0735227
-------------+----------------------------------------------------------------
5 |
_cons | -1.003302 .3524804 -2.85 0.004 -1.694151 -.3124532
------------------------------------------------------------------------------
[...]
Running mlogit on data from iteration 8, m=1:
Iteration 0: log likelihood = -93.692061
Iteration 1: log likelihood = -84.819893
Iteration 2: log likelihood = -81.752821
Iteration 3: log likelihood = -79.824403
Iteration 4: log likelihood = -79.07954
Iteration 5: log likelihood = -78.816167
Iteration 6: log likelihood = -78.665878
Iteration 7: log likelihood = -78.582992
Iteration 8: log likelihood = -78.566297
Iteration 9: log likelihood = -78.562641
Iteration 10: log likelihood = -78.561756
Iteration 11: log likelihood = -78.561571
Iteration 12: log likelihood = -78.561532 (not concave)
Iteration 13: log likelihood = -78.561531 (not concave)
Iteration 14: log likelihood = -78.56153 (not concave)
Iteration 15: log likelihood = -78.56153 (not concave)
Iteration 16: log likelihood = -78.56153 (not concave)
Iteration 17: log likelihood = -78.56153 (not concave)
Iteration 18: log likelihood = -78.56153 (not concave)
Iteration 19: log likelihood = -78.56153 (not concave)
Iteration 20: log likelihood = -78.56153 (not concave)
convergence not achieved
Multinomial logistic regression Number of obs = 69
LR chi2(7) = 30.26
Prob > chi2 = 0.0001
Log likelihood = -78.56153 Pseudo R2 = 0.1615
------------------------------------------------------------------------------
rep78 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1 |
mpg | -19.66078 405.0998 -0.05 0.961 -813.6419 774.3203
price | -.0710612 1.818402 -0.04 0.969 -3.635063 3.492941
_cons | 639.3719 . . . . .
-------------+----------------------------------------------------------------
2 |
mpg | -.089131 .1029034 -0.87 0.386 -.2908178 .1125559
price | .0001489 .0001166 1.28 0.202 -.0000797 .0003774
_cons | -.8377781 2.345939 -0.36 0.721 -5.435735 3.760178
-------------+----------------------------------------------------------------
3 | (base outcome)
-------------+----------------------------------------------------------------
4 |
mpg | .075735 .0570139 1.33 0.184 -.0360101 .1874802
price | .0000827 .0000967 0.85 0.393 -.0001068 .0002722
_cons | -2.645726 1.600045 -1.65 0.098 -5.781756 .4903036
-------------+----------------------------------------------------------------
5 |
mpg | .1692951 .0652532 2.59 0.009 .0414013 .297189
price | .0000554 .0001334 0.42 0.678 -.000206 .0003169
_cons | -5.221147 2.054025 -2.54 0.011 -9.246962 -1.195332
------------------------------------------------------------------------------
Note: 1 observation completely determined. Standard errors questionable.
convergence not achieved
mlogit failed to converge on observed data
error occurred during imputation of rep78 mpg price on m = 1
r(430);
Note that the model has converged 7 times before failing once. Here is how the wrapper, mimpt, works:
Code:
. mimpt chained ///
> (mlogit , augment) rep78 ///
> (pmm , knn(3)) mpg price ///
> , add(10) skipnonconvergence(5)
Conditional models:
rep78: mlogit rep78 mpg price , augment
mpg: pmm mpg i.rep78 price , knn(3)
price: pmm price i.rep78 mpg , knn(3)
Performing chained iterations ...
convergence not achieved
convergence not achieved
mlogit failed to converge on observed data
error occurred during imputation of rep78 mpg price on m = 1
[...]
Conditional models:
rep78: mlogit rep78 mpg price , augment
mpg: pmm mpg i.rep78 price , knn(3)
price: pmm price i.rep78 mpg , knn(3)
Performing chained iterations ...
Multivariate imputation Imputations = 10
Chained equations added = 1
Imputed: m=10 updated = 0
Initialization: monotone Iterations = 10
burn-in = 10
rep78: augmented multinomial logistic regression
mpg: predictive mean matching
price: predictive mean matching
------------------------------------------------------------------
| Observations per m
|----------------------------------------------
Variable | Complete Incomplete Imputed | Total
-------------------+-----------------------------------+----------
rep78 | 69 5 5 | 74
mpg | 44 30 30 | 74
price | 29 45 45 | 74
------------------------------------------------------------------
(complete + incomplete = total; imputed is the minimum across m
of the number of filled-in observations.)
Warning: the sets of predictors of the imputation model vary across
imputations or iterations
Warning: the imputation model failed to converge 2 times
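I have typed
Code:
mimpt ... , skipnonconvergence(5)
where the required option skipnonconvergence() specifies how many errors due to non-convergence to ignore. Here, I am willing to ignore 5 such errors. The warning message informs me that the model did not converge 2 times. Had the model failed to converge more than 5 times, the result would have been the same as with mi impute chained: mimpt would have exited with return code r(430) and discarded all imputed values.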
The output reveals how mimpt works: it repeatedly calls mi impute and adds 1 complete dataset at a time. If there is an error, the imputation of the respective dataset, say, m=1, is repeated. There are side effects: mi impute must parse the model specification repeatedly; any warning messages of mi impute (or their absence) refer only to the last imputed dataset; and any results that mi impute returns in r() refer only to the last imputed dataset. All this is to say: mimpt is a workaround that should be used with caution and should be replaced by a respective option in Stata's mi impute command.
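The idea can be sketched in a few lines of do-file code. This is not the actual mimpt source, only a minimal illustration of the retry logic described above; the local names M, maxskip, nskip, and m are hypothetical, and the data are assumed to be mi set and registered as in the example:

```stata
// Minimal sketch of the retry idea; NOT the actual mimpt implementation.
local M       10   // number of imputations wanted
local maxskip 5    // non-convergence errors we are willing to ignore
local nskip   0    // non-convergence errors seen so far
local m       0    // imputations added so far

while (`m' < `M') {
    // add one imputed dataset at a time; -capture- traps any error
    capture noisily mi impute chained   ///
        (mlogit , augment) rep78        ///
        (pmm , knn(3)) mpg price        ///
        , add(1)
    local rc = _rc
    if (`rc' == 0) {
        local ++m                       // success: move on to the next dataset
    }
    else if (`rc' == 430 & `nskip' < `maxskip') {
        local ++nskip                   // tolerated: repeat this dataset
    }
    else {
        // too many failures or a different error; a complete wrapper
        // would also drop the imputations added so far before exiting
        exit `rc'
    }
}
display as text "Warning: the imputation model failed to converge `nskip' times"
```

Because every call uses add(1), the settings must not fix the random-number seed inside the loop; otherwise a failed attempt would presumably fail again in the same way.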
For those of you who have experienced the described problem with non-convergence, who agree with my argument, and who, for whatever reason, want to stick with mi instead of ice, mimpt is available from the SSC. Thanks, as usual, to Kit Baum.
Best
Daniel
