  • Applying a conditional -if- statement to a set of variables without typing the names individually?

    Dear STATA Forum,

    I have a (hopefully simple) question about using variable shortcuts and conditional statements. Suppose I have a series of similarly named variables (var_1-var_6), all of which hold numeric values. Here is some quick reproducible example data:
    Code:
    *fake data
    clear
    input var_1 var_2 var_3 var_4 var_5 var_6
    1 0 0 1 1 1
    0 1 1 0 1 0
    0 0 0 0 1 1
    end
    
    gen country = ""
        replace country = "usa" in 1
        replace country = "can" in 2
        replace country = "sgp" in 3
    For a given country, I would like to know which variables of var_1-var_6 are greater than 0. I realize I can...
    Code:
    mdesc var* if country=="usa"
    ...to infer which variables have values from the percent missing, but is there an alternative way to apply a conditional -if- statement to all of the var* variables at once?

    For those interested (if it will help in providing a solution), the purpose of this question is to return a varlist of each var* that meets a certain level. The variables that meet that level inform a process in a separate dataset. Each varlist is unique by country, such that the varlist considered for USA differs from those for CAN, SGP, etc.

    Apologies if this question is poorly formatted - please let me know how I can improve it!
    Last edited by Jeffery Sauer; 14 Apr 2019, 20:40. Reason: clarification around suggested solution of mdesc

  • #2
    This is only possible if the variables in question are named in some similar way so that they can be described with wildcards, or form a range of consecutive variables in the data set, so that the this_var-that_var notation encompasses them.
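
    A minimal sketch of the two notations, assuming the example data from the first post is in memory (there var_1 through var_6 are stored consecutively and share the var_ prefix, so both forms expand to the same six variables):

    ```stata
    * wildcard form: every variable whose name starts with var_
    describe var_*

    * range form: every variable stored from var_1 through var_6
    describe var_1-var_6

    * an -if- qualifier on the command then applies to the whole varlist
    list var_* if country == "usa"
    ```

    Either form can be used anywhere Stata accepts a varlist; which one is safer depends on whether the names or the storage order are more likely to change.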

    Does the variable country uniquely identify observations? If so,

    Code:
    //    CREATE A LIST FOR EACH COUNTRY AFTER
    //    VERIFYING COUNTRY UNIQUELY IDENTIFIES OBSERVATIONS
    isid country, sort
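    forvalues i = 1/`=_N' {
        local country = country[`i']
        //    START WITH AN EMPTY LIST FOR THIS COUNTRY
        local vlist_`country'
        foreach v of varlist var* {
            //    MISSING VALUES COMPARE AS LARGER THAN ANY NUMBER,
            //    SO EXCLUDE THEM EXPLICITLY
            if `v'[`i'] > 0 & !missing(`v'[`i']) {
                local vlist_`country' `vlist_`country'' `v'
            }
        }
    }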
    
    //    DISPLAY THE LISTS SO YOU CAN SEE THAT IT WORKED
    levelsof country, local(countries)
    foreach c of local countries {
        display `"`c'"', `"`vlist_`c''"'
    }
    At the end of the code, local macro vlist_usa will contain the names of those var_* which are > 0 in the usa observation, etc.
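
    As a usage sketch, the stored list can be passed to any command that takes a varlist. This assumes the loop above has just run in the same do-file or interactive session, since local macros do not survive beyond it:

    ```stata
    * show the names collected for usa
    display "`vlist_usa'"

    * use the list as a varlist, restricted to the usa observation
    list `vlist_usa' if country == "usa"
    ```

    In the example data the usa row has positive values in var_1, var_4, var_5 and var_6, so those are the names the macro should hold.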

    If country does not identify observations uniquely, the rule needs to be spelled out. Suppose that var_1 = 1 in one observation with country == "whatever" but var_1 = 0 in a different observation with country == "whatever". Does var_1 go into country "whatever"'s list or not?
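
    One way to make that choice concrete, as a sketch only: assume a variable belongs in a country's list if it is > 0 in at least one of that country's observations. That rule is an assumption, not something settled in this thread. It also assumes the country values are short and contain no spaces, so they can form part of a macro name.

    ```stata
    * ASSUMED RULE: "in the list" means > 0 in at least one observation
    preserve

    * one row per country, each var_* holding its maximum within the country
    collapse (max) var_*, by(country)

    forvalues i = 1/`=_N' {
        local country = country[`i']
        * start with an empty list for this country
        local vlist_`country'
        foreach v of varlist var_* {
            * the maximum is missing if every value was missing, so exclude that
            if `v'[`i'] > 0 & !missing(`v'[`i']) {
                local vlist_`country' `vlist_`country'' `v'
            }
        }
    }

    * bring back the original data; the local macros are kept
    restore
    ```

    If the intended rule is instead "> 0 in every observation", replacing (max) with (min) in the collapse would be the corresponding change.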
    Last edited by Clyde Schechter; 14 Apr 2019, 21:10.


    • #3

      Hi Clyde,

      Thanks for your rapid response and suggested code! As of now, country does not uniquely identify observations. In the dataset's raw form the unit of observation is country-HS6 code (a six-digit Harmonized System commodity code). However, your code has given me some ideas, and I am going to play around with the dataset for a while (possibly a reshape) and get back to you!
