Dear all,
I frequently write code that looks like the code below. However, it tends to be very slow, and I wonder whether there is a better way. Fundamentally, I have two related problems:
1. I don't know how to avoid going through the data observation by observation when the calculation is a bit more complicated, such as the average over all other districts here. So my first question is: how can I use bysort so that I can replace the nested loops?
2. I seem to spend an eternity generating empty variables, using egen to create a temporary variable, using egen again (say, with max) to copy the values across all observations, and then replacing the original empty variable. So my second question is: how do I avoid this "egen tempvar1=max(var), then replace realvar=tempvar1 if x==y" construct?
Best wishes,
Stuart
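* Start from empty result variables
cap drop jk_median
cap drop jk_mean
gen jk_median=.
gen jk_mean=.
levelsof year if avg_income_2!=., local(years)
foreach y of local years {
    * year is numeric, so the loop macro must be dereferenced as `y'
    levelsof state if year==`y', local(states)
    foreach S of local states {
        levelsof district if year==`y' & state=="`S'", local(districts)
        foreach d of local districts {
            tempvar median mean maxmed maxmean
            * Median and mean over all OTHER districts in the same state and year
            egen `median'=median(avg_income_2) if district!="`d'" & state=="`S'" & year==`y'
            egen `mean'=mean(avg_income_2) if district!="`d'" & state=="`S'" & year==`y'
            * Spread those values across every observation of the state-year
            egen `maxmed'=max(`median') if state=="`S'" & year==`y' & `median'!=.
            egen `maxmean'=max(`mean') if state=="`S'" & year==`y' & `mean'!=.
            * Copy the result to the district that was left out
            replace jk_median=`maxmed' if district=="`d'" & state=="`S'" & year==`y'
            replace jk_mean=`maxmean' if district=="`d'" & state=="`S'" & year==`y'
            * Drop the temporaries so they do not pile up across iterations
            drop `median' `mean' `maxmed' `maxmean'
        }
    }
}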
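For what it is worth, here is a minimal sketch of one way the nested loops might be avoided. It rests on assumptions not stated in the post: that state and district are string variables, that year is numeric, and that "all other districts" means every observation in the same state and year that belongs to a different district. The toy data and the helper variables st_sum, st_n, d_sum, d_n, sy and grp are made up for illustration. The mean comes from group totals (state total minus district total, divided by state count minus district count). The median has no such shortcut that I know of, so the sketch uses a single loop over groups and reads r(p50) from summarize instead of going through egen twice.

```stata
clear
input str2 state str2 district year avg_income_2
"A" "a1" 2000 10
"A" "a2" 2000 20
"A" "a3" 2000 60
"B" "b1" 2000 5
"B" "b2" 2000 15
end

* Totals and non-missing counts per state-year
bysort state year: egen double st_sum = total(avg_income_2)
bysort state year: egen st_n = count(avg_income_2)

* Totals and non-missing counts per district within state-year
bysort state year district: egen double d_sum = total(avg_income_2)
bysort state year district: egen d_n = count(avg_income_2)

* Leave-one-district-out mean: (state total - own total) / (state n - own n)
gen double jk_mean = (st_sum - d_sum) / (st_n - d_n) if st_n > d_n

* Leave-one-district-out median: one loop over groups, no temporary egen variables
egen long sy  = group(state year)
egen long grp = group(state year district)
gen double jk_median = .
summarize grp, meanonly
forvalues g = 1/`r(max)' {
    * Find the state-year that group g belongs to
    summarize sy if grp == `g', meanonly
    local s = r(min)
    * Median over the other districts of that state-year
    quietly summarize avg_income_2 if sy == `s' & grp != `g', detail
    quietly replace jk_median = r(p50) if grp == `g'
}

list state district year avg_income_2 jk_mean jk_median, sepby(state)
```

On the second question, the pattern "egen to compute, egen max to spread, replace to copy" can often be swapped for either a by-group egen (which already fills every observation in the group) or for summarize followed by replace using the stored r() results, as in the loop above.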
