I have exported data from a survey software, and the export stores the data in a strange manner. In particular, for multiple choice questions, instead of creating one variable for each option, it has stored all selected options in one variable, separated by spaces. Here's an example (see the -dataex- output under "Code:").
Here c26_1_1 is, for example, "main sources of irrigation", and the subsequent columns correspond to each of the options. They should be 1 if the option was selected in c26_1_1 and 0 otherwise. For a subset of the sample the columns have been filled as they should be (see the last few rows of the example). For the others, c26_1_1 is stored as, for example, "1 4 99" when options 1, 4 and 99 were selected, and c26_1_1_1, c26_1_1_4 and c26_1_1_99 are blank when they should be 1. Similarly, if only option 4 was selected, c26_1_1 should be 4 and c26_1_1_4 should be 1.
Right now I am splitting the variable c26_1_1, storing the pieces in temp variables, and then using those temp variables to fill in the options (1/0). I want to find a more efficient way to do this, since I have many such multiple choice variables in my dataset that have been stored in this way.
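One way to skip the temp variables is to test the stored string directly for each option code. The following is only a sketch under assumptions: it uses the variable names and str1 storage from the -dataex- listing (c26_1_1 holding the codes, c26_1_`k'_1 holding the indicators), it assumes the option codes are 1 to 6 and 99, and it only touches rows where c26_1_1 is non-empty and the indicator is still blank.

```stata
* Sketch only: fills the existing str1 indicators from the space-separated
* answers in c26_1_1. Variable names are taken from the -dataex- listing.
local codes 1 2 3 4 5 6 99
foreach k of local codes {
    * Pad both strings with spaces so that code 9 cannot match inside 99
    replace c26_1_`k'_1 = "1" if c26_1_`k'_1 == "" & c26_1_1 != "" ///
        & strpos(" " + c26_1_1 + " ", " `k' ") > 0
    * Whatever is still blank on an answered row was not selected
    replace c26_1_`k'_1 = "0" if c26_1_`k'_1 == "" & c26_1_1 != ""
}
```

Rows that were already filled correctly are left untouched, and rows with an empty c26_1_1 stay blank. For the other multiple choice questions the same loop could be wrapped in an outer foreach over the question variables, provided their indicator names follow a similar pattern.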
Alternatively, if there is a way to split the variable and automatically create a new binary column for each option (1 if the option is selected, 0 otherwise), that would be great as well. That is, instead of pre-specifying c_26_1_x where x = 1, 2, 3, 4, 99, c26_1_1 would be split and a new column created for each option that actually occurs.
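The automatic version could be sketched as follows: collect every code that occurs in c26_1_1, then generate one indicator per code. The names here are assumptions for illustration only: tmp_ is a throwaway stub, and c26_1_1_opt is a hypothetical stem chosen so that the new variables do not collide with the existing str1 columns. The sketch also assumes that all codes are numeric and separated by spaces.

```stata
* Sketch only: build one 0/1 indicator per code that occurs in c26_1_1.
* tmp_ and c26_1_1_opt are hypothetical names, not part of the dataset.
split c26_1_1, generate(tmp_) destring

* Collect the distinct codes found in any of the split pieces
local codes
foreach v of varlist tmp_* {
    quietly levelsof `v', local(lv)
    local codes : list codes | lv
}
drop tmp_*

foreach k of local codes {
    * Missing when the question was not answered; padding avoids 9 matching 99
    generate byte c26_1_1_opt`k' = strpos(" " + c26_1_1 + " ", " `k' ") > 0 ///
        if c26_1_1 != ""
}
```

Because the list of codes is built from the data, an option that nobody selected gets no column. If a full set of columns is needed, the list could be typed in by hand instead.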
Any suggestions?
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input str4 c26_1_1 str1(c26_1_1_1 c26_1_2_1 c26_1_3_1 c26_1_4_1 c26_1_5_1 c26_1_6_1 c26_1_99_1) "" "" "" "" "" "" "" "" "4" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "2" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "1" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "1 99" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" 
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "4" "0" "0" "0" "1" "0" "0" "0" "" "" "" "" "" "" "" "" "4" "0" "0" "0" "1" "0" "0" "0" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" end