I have wide student test data. Each row is an individual student, and each variable (column) holds the responses to an individual question. I want to drop any variable that has no more than 5 valid (non-missing) observations, i.e., one with too many missing values. Suggestions? I'm new to "missing" but might that do the trick?
EDIT: This is in preparation for IRT. My real problem is that some questions have only 1 or 2 kids answering, which creates insufficient variation and results in error message "v12345 does not vary in the estimation sample." If there's a better way than dropping v12345, I'm all ears.
Thanks,
Bryan
