  • Dealing with missing data when creating a dummy using loop?

    Hi,

    I would like to know how to handle missing values when creating a dummy variable from several variables. For example, I have four variables, each taking a value of 1 if the household owns the item and 0 if it does not. Following some of the online guides, I have come across this:

    Code:
    gen dum1 = 0
    foreach var of varlist C1 C2 C3 C4 {
        replace dum1 = 1 if `var' == 1
    }
    This completely ignores missing values. I want to create two dummy variables: the first taking a value of 1 if the household owns any one of the four things, and the second taking a value of 1 if the household owns all four.

    An example of the data set using dataex:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(C1 C2 C3 C4)
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 1
    0 0 1 1
    0 0 1 1
    0 0 0 0
    0 0 0 1
    0 0 0 1
    0 1 1 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 1 1 1
    0 1 0 0
    0 0 0 1
    0 1 0 0
    0 0 0 0
    1 1 0 1
    0 1 0 1
    0 1 1 1
    0 0 0 1
    0 0 0 0
    0 0 0 0
    0 0 0 1
    0 0 0 0
    0 0 0 1
    0 0 0 0
    0 1 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 0
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
    0 0 1 0
    0 0 0 1
    1 0 0 1
    0 0 0 1
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 0
    0 0 0 1
    0 0 0 0
    0 0 0 1
    0 0 1 1
    0 0 0 0
    0 1 0 0
    0 1 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 0
    0 1 0 1
    0 0 1 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 0
    0 0 0 0
    0 0 0 1
    0 0 1 0
    0 0 0 1
    0 0 1 1
    0 0 1 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 0
    1 1 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 0
    0 0 0 0
    0 1 0 0
    0 0 0 0
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 1 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    0 0 0 1
    end
    label values C1 C1
    label def C1 0 "No 0", modify
    label def C1 1 "Yes 1", modify
    label values C2 C2
    label def C2 0 "No 0", modify
    label def C2 1 "Yes 1", modify
    label values C3 C3
    label def C3 0 "No 0", modify
    label def C3 1 "Yes 1", modify
    label values C4 C4
    label def C4 0 "No 0", modify
    label def C4 1 "Yes 1", modify

  • #2
    Jose, the code below may not be the most efficient, but it clearly shows the logic.

    Code:
    gen dum1 = 1 if C1==1 | C2==1 | C3==1 | C4==1
    replace dum1 = 0 if C1==0 & C2==0 & C3==0 & C4==0
    
    gen dum2 = 1 if C1==1 & C2==1 & C3==1 & C4==1
    replace dum2 = 0 if C1==0 | C2==0 | C3==0 | C4==0
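
    With the code above, dum1 stays missing when no item is 1 and at least one item is missing, and dum2 stays missing when no item is 0 and at least one item is missing. A minimal sketch for checking this, assuming dum1 and dum2 have just been created as above; nmiss is a hypothetical variable name, and the example data posted here contain no missing values, so on that data every nmiss will be 0:

    ```stata
    * nmiss is a hypothetical name: number of missing values among C1-C4 per observation
    egen nmiss = rowmiss(C1 C2 C3 C4)

    * cross-tabulate each dummy against the missing count, showing missing as its own category
    tabulate dum1 nmiss, missing
    tabulate dum2 nmiss, missing
    ```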


    • #3
      Code:
      gen any = max(C1, C2, C3, C4)
      will ignore missings to the extent possible and deliver 1 if any value is 1; 0 if otherwise any value is 0; and missing if all values are missing.

      Code:
      gen all = (C1 + C2 + C3 + C4) == 4 
      will deliver 1 if and only if all values are 1 (and 0 if any value is 0 or missing).
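
      If a 0 is not wanted when ownership of all four items cannot be determined, one possible variation is sketched below. It is not part of the suggestion above; all_known is a hypothetical variable name, and the rule (missing whenever no 0 is observed but some value is missing) is an assumption about what is wanted:

      ```stata
      * all_known is a hypothetical name
      * min() ignores missings: 0 if any observed value is 0, 1 if every observed value is 1,
      * and missing if all four values are missing
      gen all_known = min(C1, C2, C3, C4)

      * no 0 was observed, but at least one value is missing: "owns all four" is unknown
      replace all_known = . if all_known == 1 & missing(C1, C2, C3, C4)
      ```

      Under this rule an observed 0 is enough to settle the answer as 0, even when other values are missing.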
