I'm currently working with a dataset of 233 observations and 300 variables. The data come from a survey with a large number of multiple-choice questions, which is why there are so many variables relative to observations. Each possible answer to each multiple-choice question appears as a binary variable in my dataset, indicating whether that answer was chosen. I have provided an example that I believe demonstrates the problem fairly, although on a much smaller scale.
My respondents have, in a multiple-choice question, been asked which sodas they've consumed over the last year. For each soda they ticked, they've been asked how much they liked it, on an ordinal scale from 1 to 5, and similarly how often they buy that soda.
The data therefore look something like the example in the first code block below.
I'm aware that I can use the mrtab command to get an overview of the answers, but I want a new variable, as I wish to make two-way tables with other variables such as the "Like" variables.
Thus, I wish to make an overall "category" for sodas. The categories could be "Cola" and "Orange sodas", but Cola Citrus would have to appear in both (!). Now, I can code individual binary variables for whether a respondent drinks Cola or Orange (my attempt is in the second code block below).
Neither creating nor tabulating the new category variable produces an error message, but the second value of the variable, "Orange", does not contain all the observations that it should. I assume this is because each respondent can take only one value of a given variable, and as you can see, several respondents enjoy sodas from both categories. Therefore, my question is:
"How may I code a variable that allows each respondent to appear more than once, or alternatively, how would you go about analyzing whether the categories differ in how often they are consumed?"
This is further complicated by the fact that I am not allowed to use regression, but should stick to two-way or perhaps three-way tables.
Additionally, how may I produce a single combined "how often do you buy soda" variable, when this information too is spread over three variables? (I would also like to analyse the relationship between how highly a soda is rated and how often it is bought.)
Hopeful regards
Code:
input byte(Cola Fanta Cola_Citrus) float(Like_Cola Like_Fanta Like_ColaCitrus Buy_Cola Buy_Fanta Buy_ColaCitrus)
1 1 1 1 2 3 1 3 1
. . 1 . . 4 . . 3
1 . . 2 1 . 5 . .
1 1 . 3 3 . 2 3 .
1 . . 4 5 . 1 . .
. . 1 . . 2 . . 1
. 1 . . . . . 1 .
. 1 1 . . 1 . 1 1
. . . . . . . . .
1 1 . 5 4 . 5 3 .
end
label values Cola Cola
label def Cola 1 "Cola", modify
label values Fanta Fanta
label def Fanta 1 ".", modify
label values Cola_Citrus Cola_Citrus
label def Cola_Citrus 1 "Cola Citrus", modify
label values Like_Cola ordinal
label values Like_Fanta ordinal
label values Like_ColaCitrus ordinal
label values Buy_Cola ordinal
label values Buy_Fanta ordinal
label values Buy_ColaCitrus ordinal
label def ordinal 1 "Often", modify
label def ordinal 2 "Somehow Often", modify
label def ordinal 3 "Neither", modify
label def ordinal 4 "Rarely", modify
label def ordinal 5 "Never", modify
Code:
* Binary indicator: respondent drinks a cola-type soda
gen Cola2 = .
recode Cola2 (.=1) if Cola==1 | Cola_Citrus==1
* Binary indicator: respondent drinks an orange-type soda
gen Orange = .
recode Orange (.=1) if Fanta==1 | Cola_Citrus==1
* Attempting to make overall category variable
gen Category_Var = .
recode Category_Var (.=1) if Cola2==1
recode Category_Var (.=2) if Orange==1
tab Category_Var
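One direction that might fit the restriction to tables is to put the data in long format, with one row per respondent and soda, and then duplicate the Cola Citrus rows so that they count in both categories. The following is only a minimal sketch under assumptions, not a tested solution: it assumes the example data above are in memory, and the names id, Drink_, soda, drinks, liking, buyfreq, dup and category are hypothetical names introduced for illustration.

```stata
* Hypothetical respondent identifier (not in the original data)
gen long id = _n

* Give the three "consumed" indicators a common stub so reshape can pair
* them with the Like_ and Buy_ variables
rename Cola        Drink_Cola
rename Fanta       Drink_Fanta
rename Cola_Citrus Drink_ColaCitrus

* One row per respondent and soda; the soda name becomes the string variable soda
reshape long Drink_ Like_ Buy_, i(id) j(soda) string
rename (Drink_ Like_ Buy_) (drinks liking buyfreq)

* Keep only the sodas each respondent actually ticked
keep if drinks == 1

* Cola Citrus should count in both categories, so duplicate its rows;
* dup is 0 for the original row and 1 for the added copy
expand 2 if soda == "ColaCitrus", generate(dup)

gen byte category = 1 if soda == "Cola"
replace category = 2 if soda == "Fanta"
replace category = 1 if soda == "ColaCitrus" & dup == 0
replace category = 2 if soda == "ColaCitrus" & dup == 1
label define category 1 "Cola" 2 "Orange sodas"
label values category category

* Category by purchase frequency; buyfreq is now a single variable
tabulate category buyfreq

* Rating by purchase frequency, counting each respondent-soda pair once
tabulate liking buyfreq if dup == 0
```

In this layout a respondent contributes one row per soda (two for Cola Citrus), so the rows are not independent observations; any test statistic from these tables should probably be read with that in mind.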