  • Creating an indicator variable from multiple, non-mutually exclusive, dummy variables

    Hi! I'm trying to create an indicator variable from several dummy variables that are not mutually exclusive. It seems straightforward, but I am running into some challenges. Could someone please advise?

    Here is a simplified example of what I am trying to do. Say we have three dummy variables for food preferences: A1 = 1 if the respondent likes cereal for breakfast, 0 if not; A2 = 1 if the respondent likes salad for lunch, 0 if not; A3 = 1 if the respondent eats dessert with dinner, 0 if not. A respondent could like all three, or any combination of them.

    How would I create one variable that captures all of them? For instance, a new variable A that is 1 if the respondent likes cereal, 2 if they like salad, and 3 if they eat dessert, without dropping the observations where more than one of these applies?

  • #2
    Hello and welcome,

    How would you resolve the situation where A1 is 1 and A2 is 1 and A3 is also 1? Wouldn't that mean A has to be 1, 2, and 3 at the same time?

    Also, if at all possible, consult the FAQ (https://www.statalist.org/forums/help) and provide some sample data using -dataex-.

    There are many ways to deal with this. Here is one that I usually use: concatenating the dummies into a single string of 1s and 0s:

    Code:
    * Creating some fake data for illustration:
    clear
    set seed 92272
    set obs 10
    gen A1 = runiform() > .5
    gen A2 = runiform() > .5
    gen A3 = runiform() > .5
    
    * Creating a compound string variable:
    egen A = concat(A1 A2 A3)
    label variable A "Breakfast, Salad, Dessert"
    tab A
    Results:
    Code:
     Breakfast, |
         Salad, |
        Dessert |      Freq.     Percent        Cum.
    ------------+-----------------------------------
            000 |          2       20.00       20.00
            001 |          3       30.00       50.00
            010 |          1       10.00       60.00
            011 |          2       20.00       80.00
            110 |          1       10.00       90.00
            111 |          1       10.00      100.00
    ------------+-----------------------------------
          Total |         10      100.00
    If you would prefer a numeric code, the new variable A needs to take 2 x 2 x 2 = 8 values, one for each possible 1/0 combination of the three variables. If you can tell us what you plan to do with the combined variable, we may be able to point you in a better direction.
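
    As a rough sketch of the numeric version, here are two possible approaches, assuming A1, A2 and A3 are strictly 0/1 with no missing values. The names A_grp and A_code are made up for illustration, and the fake data simply repeat the setup used earlier:

```stata
* Recreate the fake data (same setup as the earlier example)
clear
set seed 92272
set obs 10
gen A1 = runiform() > .5
gen A2 = runiform() > .5
gen A3 = runiform() > .5

* Option 1: number the combinations that actually occur in the data.
* The label option attaches the underlying A1 A2 A3 values as value labels.
egen A_grp = group(A1 A2 A3), label

* Option 2: a fixed binary-weighted code from 0 (none) to 7 (all three).
* Unlike group(), the mapping does not depend on which combinations occur.
gen byte A_code = 4*A1 + 2*A2 + A3

tab A_grp
tab A_code
```

    Note that group() only assigns codes to combinations that are present, so it can return fewer than 8 distinct values, whereas the weighted sum always maps a given combination to the same number.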
    Last edited by Ken Chui; 08 Apr 2021, 11:41.
