Hi all, please consider the following example.
For each newly opened postcode, I have identified a year whose postcode I will use as a replacement. E.g., student 123 is in postcode A in 2011. Since A is a newly opened postcode, I want to replace it for continuity. Accordingly, I have identified 2012 as the year whose information I will use (replaceyear). Student 123 was in postcode B in 2012, hence I want to replace A with B in the 2011 observation for student 123.
My initial attempt (the second code block below) generated a variable with all missing values, because in no student-acadyr observation does replaceyear equal acadyr. Had there been a single replaceyear for each student, I could have copied that year to all observations of the student and then found the match. However, since there are multiple replaceyears per student, that is not an option.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str3 studentid float acadyr str1 postcode float(open replaceyear) str1(replacepcd replacepcd_exp)
"123" 2010 "B" 0    . "" ""
"123" 2011 "A" 1 2012 "" "B"
"123" 2012 "B" 0    . "" ""
"123" 2013 "C" 1 2014 "" "S"
"123" 2014 "S" 0    . "" ""
"123" 2015 "D" 1 2010 "" "B"
"124" 2012 "S" 0    . "" ""
"124" 2013 "C" 1 2012 "" "S"
"124" 2015 "C" 1 2012 "" "S"
"124" 2016 "C" 1 2012 "" "S"
"124" 2017 "S" 0    . "" ""
"126" 2012 "S" 0    . "" ""
"126" 2014 "B" 0    . "" ""
"126" 2015 "C" 1 2016 "" "B"
"126" 2016 "B" 0    . "" ""
"126" 2017 "A" 1 2016 "" "B"
"126" 2018 "A" 1 2016 "" "B"
end
What I tried initially, which was wrong, was:
Code:
sort studentid acadyr
by studentid: gen replacepcd = postcode if replaceyear == acadyr
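The condition above compares replaceyear with acadyr within the same observation, whereas the postcode that is needed sits in a different observation of the same student. One possible way to express that cross-row lookup is a self-merge. The following is only a minimal sketch: it assumes the dataex example is in memory and that each studentid-acadyr pair occurs at most once. The variable name lookup_pcd and the tempfile name lookup are hypothetical, and replacepcd_exp is taken to be the hand-filled expected result.

```stata
* Sketch only: assumes the dataex example above is loaded and that
* each studentid-acadyr pair is unique. lookup_pcd and the tempfile
* name lookup are hypothetical names chosen for illustration.

* Build a lookup table: one row per student-year with that year's postcode,
* with acadyr renamed so it can be matched against replaceyear
preserve
keep studentid acadyr postcode
rename (acadyr postcode) (replaceyear lookup_pcd)
tempfile lookup
save `lookup'
restore

* Attach the postcode recorded in replaceyear to every original observation;
* rows with missing replaceyear find no match and keep lookup_pcd empty
merge m:1 studentid replaceyear using `lookup', keep(master match) nogenerate

* Restore the original ordering and compare with the expected column
sort studentid acadyr
assert lookup_pcd == replacepcd_exp
```

If the uniqueness assumption does not hold, merge m:1 stops with an error instead of silently picking one postcode, so duplicates would have to be resolved first.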
Would appreciate any suggestion.