  • Replace dummy variable value when a variable has a numeric value somewhere in the list of all of another variable's values

    Using Stata 15.1 on Windows 10

    Hi All, first-time poster.

    I am attempting to set a dummy variable (y_dummy) to 1 when the value of y1 appears anywhere among the values of y0. Both y1 and y0 are numeric and each has 5,200 distinct values, so I have not been able to use inlist(). I am new to loops, but the code below is the general approach I was trying (with the understanding that I can't use inlist() with this many distinct values). For example, the value 123 in y1 also appears somewhere in y0, so I'd want to replace y_dummy = 1 for the observations where y1 == 123. Thanks in advance for your help.

    Code:
    sum y1
    foreach i = 1/`r(max)' {
        if `i' = inlist(y0, all_values) {
            replace y_dummy = 1
        }
    }
    The data:
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(y1 y0 y_dummy)
     39  26 .
     39  26 .
     39  26 .
     84  63 .
     84  63 .
     82  72 .
     82  72 .
    174  86 .
    174  86 .
    191  89 .
    191  89 .
    191  89 .
    137  98 .
    137  98 .
    301 108 .
    301 108 .
    301 108 .
    123 112 .
    123 112 .
    194 122 .
    194 122 .
    194 122 .
    194 122 .
    194 122 .
    124 123 .
    124 123 .
    140 127 .
    150 133 .
    150 133 .
    150 133 .
    155 139 .
    155 139 .
    175 148 .
    198 157 .
    214 173 .
    214 173 .
    206 178 .
    206 178 .
    206 178 .
    183 180 .
    183 180 .
    270 205 .
    225 224 .
    225 224 .
    226 225 .
    226 225 .
    290 283 .
    290 283 .
    292 291 .
    292 291 .
    end

  • #2
    No loop is needed to solve this problem.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(y1 y0)
     39  26
     39  26
     39  26
     84  63
     84  63
     82  72
     82  72
    174  86
    174  86
    191  89
    191  89
    191  89
    137  98
    137  98
    301 108
    301 108
    301 108
    123 112
    123 112
    194 122
    194 122
    194 122
    194 122
    194 122
    124 123
    124 123
    140 127
    150 133
    150 133
    150 133
    155 139
    155 139
    175 148
    198 157
    214 173
    214 173
    206 178
    206 178
    206 178
    183 180
    183 180
    270 205
    225 224
    225 224
    226 225
    226 225
    290 283
    290 283
    292 291
    292 291
    end
    
    preserve
    keep y0
    duplicates drop
    rename y0 y1
    tempfile y0_values
    save `y0_values'
    
    restore
    merge m:1 y1 using `y0_values', keep(master match)
    gen byte wanted = (_merge == 3)
    This does what you want, with one slight improvement. You were in the process of creating a variable that is 1 for selected observations and missing for all others. In Stata that is an invitation to problems. Indicator ("dummy") variables should be coded 1 for true and 0 (not missing) for false. The variable may also contain missing values, but those should be reserved for observations where the data do not let you determine whether the condition is true or false. A 1/. variable instead of a 0/1 variable is legal in Stata, but because Stata treats any nonzero value, including missing, as true in logical expressions, you will end up doing a lot of extra coding to work around that, and the likelihood of getting it wrong is very high. So make it a habit to code your logical variables as 0/1.
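
    To make the 0/1 advice concrete, here is a minimal sketch on toy data. The variable names flag_bad and flag_ok are made up for illustration and are not part of the original example. It shows how a 1/missing coding behaves when the variable is used directly as a logical expression.

    ```stata
    * Toy data: flag_bad is coded 1/missing, flag_ok is coded 1/0
    * (hypothetical variable names, for illustration only)
    clear
    input byte(flag_bad flag_ok)
    1 1
    . 0
    1 1
    . 0
    end

    * Any nonzero value, including missing, is treated as true
    count if flag_bad        // counts all 4 observations
    count if flag_ok         // counts only the 2 observations coded 1

    * With the 1/missing coding, an explicit comparison is needed every time
    count if flag_bad == 1   // counts 2
    ```

    With the 0/1 coding, -if flag_ok- and -if !flag_ok- both behave as expected, so the shorthand is safe to use.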

    • #3
      Thanks Clyde, this worked perfectly--and thanks for the reminder/advice on dummy creation.
