Dear all,
I have a monthly spell dataset with information about pupils' educational achievements and their final grades (see the dataex example below). My aim is to aggregate the information in the spell file so that, in the end, I have variables showing for each individual which school-leaving certificate and overall grade the person has achieved.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id byte wave_sp long splink byte certificate double grade byte(splast endm) int endy byte tag
4010912 5 220002 3 1.2999999523162842 2 6 2012 1
4010912 9 220003 5 1.600000023841858 2 7 2015 1
4010913 9 220004 5 3 2 7 2015 1
4010913 9 220003 3 2.0999999046325684 2 7 2012 1
4010914 5 220002 3 2.5 2 7 2012 0
4010915 5 220004 2 3.200000047683716 2 7 2012 0
4010918 9 220003 4 3 2 7 2015 1
4010918 7 220002 3 -20 2 7 2013 1
4010981 9 220005 5 1.899999976158142 2 6 2015 1
4010981 5 220004 3 1.899999976158142 2 6 2012 1
4011534 5 220003 3 2.799999952316284 2 9 2012 2
4011534 5 220002 2 3 2 8 2011 2
4011534 7 220004 5 4 2 7 2013 2
end
label values wave_sp en2574
label def en2574 5 "2012/2013", modify
label def en2574 7 "2013/2014", modify
label def en2574 9 "2015/2016", modify
label values certificate en2978ext1
label values grade enext1
label values splast en177
label def en177 2 "No", modify
label values endm en1874
label def en1874 6 "June", modify
label def en1874 7 "July", modify
label def en1874 8 "August", modify
label def en1874 9 "September", modify
label values endy enmiss
So far I have used the following code to restrict the data so that only the last school-leaving certificate remains for each individual.
PRIOR STEPS:
- Here I merged two data files and selected the variables I need for my analysis.
- Furthermore, I dropped observations that were duplicates in terms of id, endy, certificate and splast.
- I also tagged the duplicates that remained in my dataset.
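The post does not show the commands used for the duplicate handling. A minimal sketch of what these two steps might look like, assuming `duplicates drop` and `duplicates tag` were used, follows. The toy rows are taken from the dataex example, except for one repeated row that is added purely for illustration.

```stata
* Toy rows from the dataex example, plus one repeated row (hypothetical)
clear
input long id byte certificate byte splast int endy
4010912 3 2 2012
4010912 5 2 2015
4010912 5 2 2015
4010914 3 2 2012
end

* Drop rows that agree on id, endy, certificate and splast
duplicates drop id endy certificate splast, force

* Count the surplus rows each person still has (0 = exactly one row left)
duplicates tag id, generate(tag)
```

Tagging on `id` alone matches the `tag` values in the dataex example, where persons with two rows have `tag == 1` and persons with a single row have `tag == 0`.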
Code:
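tempvar last_year last_month
bysort id: egen `last_year' = max(endy)
keep if endy == `last_year'
* take the latest month within the last year, not across all years
* (otherwise id 4011534 would be dropped entirely: its last spell ends in July 2013, but its maximum endm is 9)
bysort id: egen `last_month' = max(endm)
keep if endm == `last_month'
drop tag `last_year'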
LATER STEPS:
- Combine the data from above with another dataset.
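One possible way to reduce the data to the last spell in a single step is to sort within id by end year and end month and keep the final row. This is only a sketch: it assumes endy and endm are never missing or negative for the relevant spells, and the toy data below is a subset of the dataex example with grades rounded to one decimal.

```stata
* Toy data: three persons from the dataex example (subset of variables, grades rounded)
clear
input long id byte certificate double grade byte endm int endy
4010912 3 1.3 6 2012
4010912 5 1.6 7 2015
4011534 3 2.8 9 2012
4011534 2 3.0 8 2011
4011534 5 4.0 7 2013
4010914 3 2.5 7 2012
end

* Sort spells chronologically within person and keep only the latest one
bysort id (endy endm): keep if _n == _N

list, clean
```

If two spells of the same person end in the same month and year, this keeps one of them arbitrarily, so a further sort key would be needed to break such ties.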
Anyway, I thought that maybe one of you knows a better way to do all of this in one step.
Thank you in advance for your help. I'm curious to see what solutions you may find. If it matters, I'm using Stata 14.
Jonas