Dear members,
I have tried to find my answer, but I could not find enough material to be sure about this issue.
I have a survey dataset in which one of the variables is an unordered multiple-response question: the respondent answers "yes" or "no" to each of six options, choosing as many as apply in her case.
I expect this question to become my dependent variable, and obviously I cannot fit a multinomial model directly to the data as they are.
After reading some posts on the old and new Statalist, as well as Cox and Kohler's "Speaking Stata: On structure and shape: the case of multiple responses", I am still puzzled.
First, the variable treatment. I found two solutions that seem quick, feasible, and rather easy to interpret after the regression.
1) concatenate the answers to get responses like "100100" (one digit per option).
2) "reshape to long"
The first one will, in practice, give me one category for each combination of answers, if I understood correctly. With six yes/no options that means up to 2^6 = 64 categories, and I am concerned that this is too many.
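To make option 1 concrete, here is a minimal sketch of what I have in mind. The variable names (id, q1-q6) and the four toy respondents are made up for illustration and are not my real data; each q is assumed to be coded 0/1.

```stata
* Toy data: hypothetical names, one 0/1 indicator per option
clear
input id q1 q2 q3 q4 q5 q6
1 1 0 0 1 0 0
2 0 1 0 0 0 1
3 1 0 0 1 0 0
4 0 0 0 0 0 0
end

* Option 1: paste the six indicators into one string such as "100100"
egen pattern = concat(q1-q6)

* See which combinations actually occur and how often
tabulate pattern, sort

* Count the distinct combinations observed in the data
bysort pattern: generate byte first = (_n == 1)
count if first
```

The count at the end is the number of categories a model on the concatenated variable would have to handle, which may be far fewer than 64 if most combinations never occur.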
The second one creates a "false panel", as was pointed out somewhere: each respondent contributes six rows, one per option. Which approach would you take here?
As for the "post variable treatment": can I analyze the data with the regular models available in Stata? Do I need to pay attention to anything else?
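For option 2, this is roughly the reshape I am considering. Again, id and q1-q6 are hypothetical names with toy values, and the new variable name option is my own choice.

```stata
* Toy data: hypothetical names, one 0/1 indicator per option
clear
input id q1 q2 q3 q4 q5 q6
1 1 0 0 1 0 0
2 0 1 0 0 0 1
3 1 0 0 1 0 0
4 0 0 0 0 0 0
end

* Option 2: one row per respondent-option pair
* q holds the 0/1 answer, option runs from 1 to 6
reshape long q, i(id) j(option)

* Six rows per respondent, grouped by id
list id option q, sepby(id)
```

As I understand it, the six rows that share an id are not independent observations, so I assume any model fitted to this layout would have to account for that, perhaps with something like vce(cluster id), but I am not sure this is enough.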
I hope I was clear in my questions, but if I was not, I'll be happy to clarify them.
Best