I'm trying to run -mi impute chained- on a fairly large data set with a large number of variables. The code is too long to post here, but I think the following excerpt includes everything relevant:
Stata runs through things just fine until it gets to the -mlogit-s. Then it performs a bunch of them successfully for both the marital status (3 levels) and work status (4 levels) variables. (The output is too long to show here.) But then it aborts with the following output:
I don't know what to make of "too few categories," and I don't know how to proceed to troubleshoot it. Clearly, both marital status and work status have more than 2 levels: they have already been used as the outcomes of -mlogit- earlier. I suppose that in some of the iterations only one or two of the outcome levels might actually be represented in the estimation sample, though that would be a bit surprising, as none of the outcomes is particularly rare:
But I can't even start to troubleshoot this, because the output doesn't even tell me which of these outcome variables is implicated in the problem.
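One check I could try, as a rough sketch only: fit each -mlogit- by hand on the original data (m=0) and see whether either one fails or has fewer outcome levels than expected in its estimation sample. This assumes the local macro regular is already defined as in my excerpt, and it is a simplification, because the chained model also conditions on the other imputed variables, so a clean result here would not rule out a problem inside the chain.

```stata
// Hypothetical diagnostic loop; assumes the local macro regular is already defined.
// Simplification: only the regular variables are used as predictors here, whereas
// the chained model also conditions on the other imputed variables.
foreach v in work_status maritalstatus {
    display as text _newline "Checking outcome: `v'"
    // Fit on the original data (m=0), then tabulate the outcome in the estimation sample
    capture noisily mi xeq 0: mlogit `v' `regular', iterate(100) ; tabulate `v' if e(sample)
    display as text "`v': return code = " _rc
}
```

A nonzero return code, or a tabulation with fewer levels than the full-sample table, would at least point to one of the two variables.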
Any thoughts on how I can figure this out?
Code:
// IDENTIFY REGULAR VARIABLES AND VARIABLES TO IMPUTE
mi set mlong
mi set M = 1 // UNTIL WE GET IT WORKING, THEN M = 50

ds /*several_dozen_variables*/
local for_pmm `r(varlist)'
ds /*another_bunch_of_count_variables*/
local for_poisson `r(varlist)'
ds /*a_few_dichotomies*/
local for_logit `r(varlist)'
ds work_status maritalstatus
local for_mlogit `r(varlist)'

/* DEFINITIONS OF LOCAL MACROS regular passive imputed HERE */

mi register regular `regular'
mi register passive `passive'
mi register imputed `imputed'

mi impute chained ///
    (pmm, knn(1) noisily) `for_pmm' ///
    (poisson, noisily iterate(100)) `for_poisson' ///
    (mlogit, augment noisily iterate(100)) `for_mlogit' ///
    (logit, augment noisily iterate(100)) `for_logit' ///
    = `regular', augment report replace force
Code:
Running mlogit on data from iteration 1, m=1:
note: ethnicityethn3 omitted because of collinearity
too few categories
error occurred during imputation of income cage_score sdsworkyessq001 sdssq002 sdssq003 qol_total phq2_score v4_ptsd_level medical_conditions_after_911 rescue_occasions work_status maritalstatus ethn3a trainingsq001 trainingsq002 on m = 1
r(148);
Code:
. mi xeq 0: tab1 maritalstatus work_status

m=0 data:
-> tab1 maritalstatus work_status

-> tabulation of maritalstatus

             maritalstatus |      Freq.     Percent        Cum.
---------------------------+-----------------------------------
             Never Married |        603       14.54       14.54
        Married/Cohabiting |      3,054       73.63       88.16
Widowed/Divorced/Separated |        491       11.84      100.00
---------------------------+-----------------------------------
                     Total |      4,148      100.00

-> tabulation of work_status

         work_status |      Freq.     Percent        Cum.
---------------------+-----------------------------------
     Working (FT/PT) |      2,639       64.55       64.55
            Disabled |        457       11.18       75.73
             Retired |        821       20.08       95.82
Unempl/Retired/Other |        171        4.18      100.00
---------------------+-----------------------------------
               Total |      4,088      100.00