  • Combining dichotomous variables into one

    Hello Statalist

    I am having trouble with my data and I hope you can help me.
    I have collected a conjoint dataset with 1251 observations. It contains three rounds of 15 different dichotomous variables, which makes 45 different dichotomous variables. Each respondent has answered three of these. I have reshaped the dataset from long to wide and back to long again, so my total number of observations increased to 1251 x 3 = 3753. My problem is that when I try to combine all my dichotomous variables into one combined dichotomous variable, my total number of observations is 1251 and not 3753. I think the problem is the way Stata treats missing values: because each of my 1251 observations has answered only 3 of the 45 dichotomous variables, the rest of the variables are missing for that observation.

    I have tried the replace command:

    gen var1=.
    replace var1=0 if var21 !=. & var22==0 | var22 !=. & var21==0 ....
    and so on, but it doesn't work.

    egen var1, totalrow(varlist) will not work in my case either, because I want my final variable to be dichotomous as well.

    I hope I have given you enough information to help me with my situation.

    Best regards

    Rasmus




  • #2
    I find it difficult to follow exactly what you want here but here's a try.

    You want to ignore missings and to combine whatever is 1 or 0 into a new dichotomous (binary, zero-one, indicator, dummy, Boolean, ...) variable.

    I can think of two useful ways to do that with egen: use rowmax() to get whether any answer is 1, and rowmin() to get whether all answers are 1.

    This data example is sufficient to show that these functions ignore missings.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(var1 var2 var3 var4)
    1 1 1 .
    1 1 0 .
    1 0 1 .
    1 0 0 .
    0 1 1 .
    0 1 0 .
    0 0 1 .
    0 0 0 .
    end
    . egen any1 = rowmax(var*)
    
    . egen all1 = rowmin(var*)
    
    . l
    
         +-----------------------------------------+
         | var1   var2   var3   var4   any1   all1 |
         |-----------------------------------------|
      1. |    1      1      1      .      1      1 |
      2. |    1      1      0      .      1      0 |
      3. |    1      0      1      .      1      0 |
      4. |    1      0      0      .      1      0 |
      5. |    0      1      1      .      1      0 |
         |-----------------------------------------|
      6. |    0      1      0      .      1      0 |
      7. |    0      0      1      .      1      0 |
      8. |    0      0      0      .      0      0 |
         +-----------------------------------------+
    With a long layout, you could still use max() and min() together with by:.
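
    A minimal sketch of that long-layout approach, assuming one row per respondent and round. The variable names id, round and answer and the data values are hypothetical, not taken from the thread:

    ```stata
    * Hypothetical long layout: one row per respondent (id) and round
    clear
    input float(id round answer)
    1 1 1
    1 2 0
    1 3 .
    2 1 0
    2 2 0
    2 3 0
    end

    * egen's max() and min() ignore missings, just like rowmax() and rowmin()
    bysort id: egen any1 = max(answer)
    bysort id: egen all1 = min(answer)

    list, sepby(id)
    ```

    Here any1 is 1 for every row of respondent 1 and 0 for respondent 2, while all1 is 0 for both, because each respondent has at least one 0.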

    Otherwise I can't see how any combined variable can also be dichotomous. You could concatenate whatever is not missing, but that is a different story.

    Also rowtotal() -- not totalrow() -- does ignore missings to the extent possible, but if three (0, 1) variables are summed the result will be 0, 1, 2 or 3.
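
    As a rough illustration of that point, the sketch below sums three (0, 1) variables with rowtotal() and counts the answered ones with rownonmiss(). The names n_ones, n_answered and any1 are hypothetical, and turning the sum back into an indicator is only one possible choice, not something proposed in the thread:

    ```stata
    clear
    input float(var1 var2 var3)
    1 0 .
    0 0 0
    . . .
    end

    * rowtotal() treats missings as 0, so an all-missing row sums to 0
    egen n_ones = rowtotal(var1-var3)

    * rownonmiss() counts how many of the variables are not missing
    egen n_answered = rownonmiss(var1-var3)

    * Assumption: code "any 1" as 1, and leave all-missing rows missing
    gen byte any1 = n_ones > 0 if n_answered > 0

    list
    ```

    The last step gives the same answer as rowmax() for (0, 1) variables; it is shown only to make the link between the sum and an indicator explicit.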



    • #3
      My data looks similar to this.
      [Image: Udklip.PNG]


      I have 45 variables instead of 6. These were collected in three rounds of the same survey that my data is based on. So, for example, every observation has a value in 1 of the first 15 variables, in 1 of the 15/30 variables and in 1 of the 30/45 variables. So I can combine var1-var15 into one binary variable, and the same for var15-30 and 30-45. My problem is when I want to combine all 45 variables into 1. So, to be clear, I want a variable where all who have the value 0 in 1/45 variables is equal to zero and all who have the value 1 in 1/45 variables is equal to 1. I don't know if this is possible or not.
      When using, for example, egen rowmax(var1 ... var45) or egen rowmin(var1...var45), Stata automatically codes 2/3 of my dataset as missing, and I'm not sure about the logic for the last 1/3 of my dataset.

      I hope this information is useful



      • #4
        Sorry, but I can't fully follow this either. Sure, you have 45 variables, not 6, but you don't give and I don't need to produce an example with 45 to show the principles.

        Wanting 1 if any single variable is 1 and 0 if any single variable is 0 are contradictory rules, useless for any mix of 1s and 0s. That is what you seem to be asking with

        all who have the value 0 in 1/45 variables is equal to zero and all who have the value 1 in 1/45 variables is equal to 1
        The only circumstance in which rowmin() and rowmax() return missing is if all the values fed to them are missing. Just one non-missing value is sufficient to produce a non-missing result. Here is a demonstration:

        Code:
         
        . clear
        
        . set obs 1
        Number of observations (_N) was 0, now 1.
        
        . gen var1 = 1
        
        . forval j = 2/45 {
         gen var`j' = .
         }
        
        .  egen min = rowmin(var*)
        
        . egen max = rowmax(var*)
        
        . l min max
        
             +-----------+
             | min   max |
             |-----------|
          1. |   1     1 |
             +-----------+
        .
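
        A complementary sketch, under the same setup, adds a second observation in which all 45 variables are missing. That is the one case where rowmin() and rowmax() return missing:

        ```stata
        clear
        set obs 2

        * Observation 1 has a single non-missing value; observation 2 has none
        gen var1 = 1 in 1
        forvalues j = 2/45 {
            gen var`j' = .
        }

        egen min = rowmin(var*)
        egen max = rowmax(var*)

        list min max
        ```

        Observation 1 gets min and max equal to 1, and observation 2 gets missing for both.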

