Combining dichotomous variables into one

Rasmus Bang Borup

Join Date: May 2023

Posts: 5
#1

11 May 2023, 23:19

Hello Statalist

I am having trouble with my data and I hope you can help me.
My problem is that I have a conjoint dataset. My total number of observations is 1251. In this dataset I have three rounds of 15 different dichotomous variables, which makes 45 different dichotomous variables. Each respondent has answered three of these. I have reshaped the dataset from long to wide and back to long again, so my total number of observations increased to 1251 x 3 = 3753. My problem is that when I try to combine all my dichotomous variables into one combined dichotomous variable, my total number of observations is 1251 and not 3753. I think the problem is the way Stata treats missing values. Because each of my 1251 respondents has answered only 3 of the 45 dichotomous variables, the remaining variables are missing for them.

I have tried the replace command:

gen var1=.
replace var1=0 if var21 !=. & var22==0 | var22 !=. & var21==0 .... and so on but it doesn't work

egen var1, totalrow(varlist) will not work in my case either, because I want my final variable to be dichotomous as well.

I hope I have given you enough information to help me with my situation.

Best regards

Rasmus
Nick Cox

Join Date: Mar 2014

Posts: 35783
#2

12 May 2023, 01:14

I find it difficult to follow exactly what you want here but here's a try.

You want to ignore missings and to combine whatever is 1 or 0 into a new dichotomous (binary, zero-one, indicator, dummy, Boolean, ...) variable.

I can think of two useful ways to do that with egen: use rowmax() to get whether any answer is 1, and rowmin() to get whether all answers are 1.

This data example is sufficient to show that these functions ignore missings.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(var1 var2 var3 var4)
1 1 1 .
1 1 0 .
1 0 1 .
1 0 0 .
0 1 1 .
0 1 0 .
0 0 1 .
0 0 0 .
end

. egen any1 = rowmax(var*)

. egen all1 = rowmin(var*)

. l

     +-----------------------------------------+
     | var1   var2   var3   var4   any1   all1 |
     |-----------------------------------------|
  1. |    1      1      1      .      1      1 |
  2. |    1      1      0      .      1      0 |
  3. |    1      0      1      .      1      0 |
  4. |    1      0      0      .      1      0 |
  5. |    0      1      1      .      1      0 |
     |-----------------------------------------|
  6. |    0      1      0      .      1      0 |
  7. |    0      0      1      .      1      0 |
  8. |    0      0      0      .      0      0 |
     +-----------------------------------------+

With a long layout, you could still use the egen functions max() and min() together with by:.
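As a minimal sketch of the long-layout idea, assume one row per respondent and round, with hypothetical variable names id, round and answer (none of these come from the original data). The egen functions max() and min() ignore missing values within each by-group, just as rowmax() and rowmin() do across variables.

```stata
* Hypothetical long layout: one row per respondent (id) and round,
* with a single 0/1 answer that may be missing.
clear
input byte(id round answer)
1 1 1
1 2 0
1 3 .
2 1 0
2 2 0
2 3 0
3 1 .
3 2 .
3 3 .
end

* any1 is 1 if any non-missing answer within id is 1
bysort id: egen any1 = max(answer)

* all1 is 1 only if every non-missing answer within id is 1
bysort id: egen all1 = min(answer)

list, sepby(id)
```

A respondent whose answers are all missing (id 3 here) gets missing for both new variables; everyone else gets 0 or 1.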

Otherwise I can't see how any combined variable can also be dichotomous. You could concatenate whatever is not missing, but that is a different story.

Also rowtotal() -- not totalrow() -- does ignore missings to the extent possible, but if three (0, 1) variables are summed the result will be 0, 1, 2 or 3.
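To illustrate the point about rowtotal(), here is a small sketch on made-up data. It assumes you would want a row with no answers at all to stay missing, which is what the missing option does; the last step is just one possible way back to a 0/1 indicator and is equivalent to rowmax() for (0, 1) variables.

```stata
clear
input byte(var1 var2 var3)
1 1 .
0 0 0
. . .
end

* Without options, rowtotal() treats missing as 0, so row 3 gets 0
egen total = rowtotal(var1-var3)

* With the missing option, a row that is entirely missing stays missing
egen total_m = rowtotal(var1-var3), missing

* One way back to a 0/1 indicator: any 1 among the non-missing values
gen byte any1 = total_m > 0 if !missing(total_m)

list
```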
Rasmus Bang Borup

Join Date: May 2023

Posts: 5
#3

12 May 2023, 01:48

My data looks similar to this.

I have 45 variables instead of 6. These were collected in three rounds of the same survey that my data is based on. So, for example, every observation has a value in 1 of the first 15 variables, in 1 of variables 16-30 and in 1 of variables 31-45. So I can combine var1-var15 into one binary variable, and do the same for var16-var30 and var31-var45. My problem arises when I want to combine all 45 variables into one. To be clear, I want a variable where all who have the value 0 in 1/45 variables is equal to zero and all who have the value 1 in 1/45 variables is equal to 1. I don't know if this is possible or not.
When using, for example, egen rowmax(var1 ... var45) or egen rowmin(var1...var45), Stata automatically codes 2/3 of my dataset as missing, and I am not sure about the logic for the last 1/3 of my dataset.

I hope this information is useful
Nick Cox

Join Date: Mar 2014

Posts: 35783
#4

12 May 2023, 02:44

Sorry, but I can't fully follow this either. Sure, you have 45 variables, not 6, but you don't give and I don't need to produce an example with 45 to show the principles.

Wanting 1 if any single variable is 1 and 0 if any single variable is 0 are contradictory rules, useless for any mix of 1s and 0s. That is what you seem to be asking with

all who have the value 0 in 1/45 variables is equal to zero and all who have the value 1 in 1/45 variables is equal to 1

The only circumstance in which rowmin() and rowmax() return missing is if all the values fed to them are missing. Just one non-missing value is sufficient to produce a non-missing result. Here is a demonstration:

Code:

. clear

. set obs 1
Number of observations (_N) was 0, now 1.

. gen var1 = 1

. forval j = 2/45 {
        gen var`j' = .
  }

. egen min = rowmin(var*)

. egen max = rowmax(var*)

. l min max

     +-----------+
     | min   max |
     |-----------|
  1. |   1     1 |
     +-----------+
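If each row really holds just one answer among the variables being combined (an assumption about the data, not something shown in the thread), then rowmax() and rowmin() must agree and either one is the combined variable. A sketch of how that could be checked, using rownonmiss() to count the answers per row on made-up data:

```stata
clear
input byte(var1 var2 var3)
1 . .
. 0 .
. . 1
. . .
end

* Count the non-missing answers in each row
egen n_answered = rownonmiss(var1-var3)

* When exactly one answer is present, rowmax() and rowmin() agree
egen combined = rowmax(var1-var3)
egen check    = rowmin(var1-var3)
assert combined == check if n_answered == 1

* Rows with no answers at all are the only ones left missing
count if missing(combined)
list
```

Tabulating n_answered on the real data would show whether rows carry zero, one or several answers, which in turn shows whether missing results come from rows that are entirely missing.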
