I would like to calculate the leave-out weighted median of a variable within groups. The code below illustrates the problem: for each make of car, it calculates the weighted median price of all other cars in the same group (foreign or domestic), using sales as the weight. It is built on this FAQ answer and this answer to a previous question on Statalist.
The problem is that this code is far too slow, because it loops over every observation and recomputes the group median each time. I have a dataset of around 20 million observations, so the calculation needs to run much more quickly. Is there a way I can vectorise this operation, or at least speed it up dramatically?
Code:
sysuse auto, clear
gen sales = floor(runiform()*100) // create artificial weight variable
capture drop leave_out_med_sales
gen leave_out_med_sales = .
tempvar temp
forvalues i = 1/`=_N' {
    // copy price, setting observation i to missing so it is left out
    qui gen `temp' = price
    qui replace `temp' = . if _n == `i'
    // weighted median of the remaining cars in the same foreign/domestic group
    qui su `temp' [w = sales] if foreign == foreign[`i'], detail
    qui replace leave_out_med_sales = r(p50) if _n == `i'
    drop `temp'
}