
  • Nested loop to generate new variables

    Is there a way to use a nested loop to rewrite the following, where one loop runs over the years 2014 and 2016 and the other runs over the statistics mean and sd?

    Code:
        by villageid_2014, sort: egen village_income_mean_2014=mean(income_current_2014)
        by villageid_2016, sort: egen village_income_mean_2016=mean(income_current_2016)
        by villageid_2014, sort: egen village_income_sd_2014=sd(income_current_2014)
        by villageid_2016, sort: egen village_income_sd_2016=sd(income_current_2016)
    I mean something like the following, although I know it is way off (I'm just learning to write loops in Stata).

    Code:
        local i 2014 2016
        local j mean sd
        foreach var in `i' {
            foreach var in `j' {
                by villageid_`i', sort: egen vdc_income_`j'=`j'(income_current_`i')
            }
        }
    Thanks!

  • #2
    Well, your code is almost right.

    Code:
    local years 2014 2016
    local stats mean sd
    
    foreach y of local years {
        foreach s of local stats {
            by villageid_`y', sort: egen village_income_`s'_`y' = `s'(income_current_`y')
        }
    }
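    To see why the original attempt fails, it may help to display what the macros expand to inside the loop. This is only a minimal sketch that reuses the macro names from the question: the loop macro `var' takes one value per iteration, but `i' still holds the whole list, so villageid_`i' would expand to "villageid_2014 2016". The original new variable name also lacks the year, so even with the right macros the second year would collide with the first.

    Code:
    * macro names i and var are taken from the question in #1
    local i 2014 2016

    foreach var in `i' {
        * `var' changes on each pass; `i' is always the full list
        display "var = `var'   i = `i'"
    }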
    A couple of tips on programming style that make it easier to write correct code, and also make it easier to understand what code is doing when you come back and read it sometime later:

    1. Give variables, local macros, etc. names that suggest what they are. Naming a local macro i or j doesn't give you any idea what it's being used for.

    2. Give related variables and macros related names. Using y to represent a local macro that iterates over years is suggestive. Using var is not. In fact, it's worse than that: the name var suggests it's a variable, but it isn't.

    3. Give variables and macros that need to be distinguished from each other distinctive names. So don't use `i' and `j', which are easily confused with each other, to represent years and statistical functions in this code, because it's too easy to switch them by mistake when you write code.

    4. Be generous with whitespace. Don't crowd things together when they don't have to be.

    5. When it is possible to use -foreach x of local ...- or -foreach x of numlist ...- or -foreach x of varlist ...-, do so in preference to -foreach x in ...-. In addition to making the code easier for human readers to understand, it also runs more quickly (though the speed difference is only noticeable in loops with a really large number of iterations).
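    As a rough illustration of the -foreach- forms named in the last tip, here is a small self-contained sketch. The toy data (5 observations with made-up incomes) are an assumption for the example only and are not from the original question.

    Code:
    * toy data, purely illustrative
    clear
    set obs 5
    generate income_current_2014 = _n
    generate income_current_2016 = 2 * _n

    * of local: iterate over the words stored in a local macro
    local years 2014 2016
    foreach y of local years {
        summarize income_current_`y'
    }

    * of numlist: Stata expands the number list, here to 2014 and 2016
    foreach y of numlist 2014(2)2016 {
        display "year `y'"
    }

    * of varlist: the wildcard is matched against variables in memory
    foreach v of varlist income_current_* {
        display "`v'"
    }
    With -of varlist-, Stata also checks that the named variables exist, so a misspelled name stops the loop with an error before any iteration runs.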



    • #3
      Great! I wasn't too far away from that at all! Thanks again, Clyde, for your wonderful answers, as always. I will definitely take these points into consideration.



      • #4
        Hi, I have one more query.

        Can you also use a loop to cumulatively sum multiple values?

        Case in point:
        Code:
        gen var_idx_2014=((a_idx_2014*a_weight_2014)+(b_idx_2014*b_weight_2014)+(c_idx_2014*c_weight_2014)+(d_idx_2014*d_weight_2014))/20
        gen var_idx_2016=((a_idx_2016*a_weight_2016)+(b_idx_2016*b_weight_2016)+(c_idx_2016*c_weight_2016)+(d_idx_2016*d_weight_2016))/20
        Here, var_idx_2014 and var_idx_2016 are the weighted averages of a, b, c and d, based on their respective weights (the weights sum to 20).

        I can use a loop for the year (2014 and 2016), which will take the following form:
        Code:
        local year 2014 2016
        
        foreach y of local year {
            gen var_idx_`y'=((a_idx_`y'*a_weight_`y')+(b_idx_`y'*b_weight_`y')+(c_idx_`y'*c_weight_`y')+(d_idx_`y'*d_weight_`y'))/20
        }
        But, instead of adding those four terms like that, can a loop iterate over a, b, c, and d while cumulatively adding (x_idx_`y'*x_weight_`y'), where x = a, b, c, d?

        I'm just curious. Thanks!



        • #5
          Code:
          quietly foreach y in 2014 2016 {
              gen var_idx_`y'= 0
              foreach x in a b c d {
                  replace var_idx_`y' = var_idx_`y' + `x'_idx_`y' * `x'_weight_`y'
              }
              replace var_idx_`y' = var_idx_`y' / 20
          }
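
          One way to convince yourself that the accumulating loop reproduces the direct formula is to run it on toy data where the answer is known. The sketch below is only an illustration: the data are made up (every index equals the observation number and every weight is 5, so the four weights sum to 20), which means the weighted average should equal _n in every row.

          Code:
          * toy data, purely illustrative: idx = _n, each weight = 5
          clear
          set obs 4
          foreach y in 2014 2016 {
              foreach x in a b c d {
                  generate `x'_idx_`y'    = _n
                  generate `x'_weight_`y' = 5
              }
          }

          quietly foreach y in 2014 2016 {
              generate var_idx_`y' = 0
              foreach x in a b c d {
                  replace var_idx_`y' = var_idx_`y' + `x'_idx_`y' * `x'_weight_`y'
              }
              replace var_idx_`y' = var_idx_`y' / 20
              * (4 terms * 5 * _n) / 20 = _n, so this should hold in every row
              assert var_idx_`y' == _n
          }
          As with the direct formula, a missing value in any of the components makes the result missing for that observation.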



          • #6
            Nice! Great! Thanks a lot, Nick!

