  • How to Avoid too many loops and operations with egen

    Dear all,

    I frequently write code like the example below. However, it tends to be very slow, and I wonder if there is a better way. Fundamentally, I have two related problems:

    1. I don't know how to avoid going through the data observation by observation when the calculation is a bit more complicated, such as the average of all other districts here. So my first question is: how can I use bysort to replace the nested loops?
    2. I seem to spend an eternity generating empty variables, using egen to create a temporary variable, using egen again (say with max) to copy the values across all observations, and then replacing the original empty variable. So my second question is: how do I avoid the "egen tempvar1=max(var)", then "replace realvar=tempvar1 if x==y" construct?

    Best wishes,

    Stuart



    cap drop jk_median
    cap drop jk_mean
    gen jk_median=.
    gen jk_mean=.

    * For each district, take the median and mean of avg_income_2 over the
    * other districts in the same state and year.
    levelsof year if avg_income_2!=., local(years)
    foreach y of local years {
        levelsof state if year==`y', local(states)
        foreach S of local states {
            levelsof district if year==`y' & state=="`S'", local(districts)
            foreach d of local districts {
                tempvar median mean maxmed maxmean
                egen `median'=median(avg_income_2) if district!="`d'" & state=="`S'" & year==`y'
                egen `mean'=mean(avg_income_2) if district!="`d'" & state=="`S'" & year==`y'
                egen `maxmed'=max(`median') if state=="`S'" & year==`y' & `median'!=.
                egen `maxmean'=max(`mean') if state=="`S'" & year==`y' & `mean'!=.
                replace jk_median=`maxmed' if district=="`d'" & state=="`S'" & year==`y'
                replace jk_mean=`maxmean' if district=="`d'" & state=="`S'" & year==`y'
            }
        }
    }

  • #2
    http://www.stata.com/support/faqs/da...ng-properties/

    Some further tricks in http://www.stata-journal.com/article...article=dm0075
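
    For the mean over all other districts, one loop-free approach is to compute the group total and count once with bysort and then subtract each district's own value. The following is only a minimal sketch: the toy data and the helper names grp_total and grp_n are made up for illustration, and it assumes exactly one observation per district within each state and year.

    ```stata
    * Toy data (hypothetical): one observation per district per state-year.
    clear
    input str2 state int year str1 district double avg_income_2
    "AA" 2000 "a" 10
    "AA" 2000 "b" 20
    "AA" 2000 "c" 60
    "BB" 2000 "a" 5
    "BB" 2000 "b" 15
    end

    * Total and count of nonmissing values, computed once per state-year.
    bysort state year: egen double grp_total = total(avg_income_2)
    bysort state year: egen grp_n = count(avg_income_2)

    * Leave-one-out mean: drop the district's own value from the total and count.
    gen double jk_mean = (grp_total - avg_income_2) / (grp_n - 1) if !missing(avg_income_2)

    * A district whose own value is missing contributes nothing to the total.
    replace jk_mean = grp_total / grp_n if missing(avg_income_2)

    drop grp_total grp_n
    list, sepby(state year)
    ```

    A state-year with a single nonmissing district gets a missing jk_mean, because division by zero yields missing in Stata. The median of the other districts has no comparable subtraction shortcut, so this sketch does not cover jk_median.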

    You (almost) never need to do this

    Code:
    egen tempvar1=max(var)
    and you should always consider doing this

    Code:
    su var, meanonly
    and then using r(max)
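
    Put together, the pattern might look like the sketch below. The toy data are made up for illustration; var, realvar, x and y are the placeholder names from the question.

    ```stata
    * Toy data (hypothetical), using the placeholder names from the question.
    clear
    input double var byte x byte y
    3 1 1
    7 1 2
    5 2 2
    end

    gen double realvar = .

    * summarize, meanonly leaves the maximum in r(max); no extra variable is created.
    summarize var, meanonly
    replace realvar = r(max) if x == y

    list
    ```

    An if qualifier on summarize restricts the calculation to a subset, and r(mean) and r(min) are left behind by the same call. The median is not available with meanonly; summarize with the detail option returns it as r(p50).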



    • #3
      Using r(max) has sped things up tremendously. Thanks very much.
