I am working with real estate appraisal data. Each appraisal (appr_id) has multiple comps (comp_id) associated with it. Comps can be used across multiple appraisals. My data contains 27 million observations, and looks like:
Appraisal 1 has a comp in common with appraisal 2 (h), and appraisal 3 has a comp in common with appraisal 2 (a), so appraisals 1-3 should have the same value in the new variable. Appraisal 4 and appraisal 5 both use p as a comp, so they should have the same value in the new variable. Appraisal 6 does not share a comp with any of the other appraisals, so it should have its own value in the new variable. The final dataset should look like:
I'm guessing there's an easy solution, but I've hit the wall. Any help would be greatly appreciated. Thanks!
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float unique_id str1(appr_id comp_id) 1 "1" "h" 2 "1" "c" 3 "1" "s" 4 "2" "a" 5 "2" "h" 6 "2" "m" 7 "3" "a" 8 "3" "y" 9 "3" "x" 10 "4" "p" 11 "4" "q" 12 "5" "z" 13 "5" "p" 14 "6" "w" 15 "6" "t" end
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float unique_id str1(appr_id comp_id) float newvar 1 "1" "h" 1 2 "1" "c" 1 3 "1" "s" 1 4 "2" "a" 1 5 "2" "h" 1 6 "2" "m" 1 7 "3" "a" 1 8 "3" "y" 1 9 "3" "x" 1 10 "4" "p" 2 11 "4" "q" 2 12 "5" "z" 2 13 "5" "p" 2 14 "6" "w" 3 15 "6" "t" 3 end
I'm guessing there's an easy solution, but I've hit the wall. Any help would be greatly appreciated. Thanks!
Comment