My data are individual-level panel data in long format. I have a variable 'pnr' giving the id of the person, 'year' giving the year, 'degree' (named 'education' in the example data) giving the type of the individual's highest obtained degree, and 'gradyear' giving the year in which the individual obtained that degree. A -dataex- example of the data is given in the first code block below.
I need to count the number of individuals graduating in each year. A plain tabulation of gradyear (the tab command in the second code block below) won't do, because gradyear is repeated on every row for a person once it is known. In the example, graduation year 3 gets a frequency of 2 even though only person 1 graduated in year 3. So how do I count the number of occurrences of gradyear==3 (and of the other years) while counting only one occurrence per person?
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte(pnr year gradyear education)
1 1 . .
1 2 . .
1 3 3 1
1 4 3 1
2 1 1 2
2 2 1 2
2 3 1 2
2 4 1 2
3 1 1 1
3 2 1 1
3 3 1 1
3 4 4 2
end
Code:
tab gradyear
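For reference, one possible way to get a once-per-person count is to flag a single row for each distinct combination of pnr and gradyear and tabulate only the flagged rows. This is a minimal sketch, assuming the variable names from the example data. The variable name onepergrad is made up for illustration.

```stata
* Sketch only: assumes the variables pnr and gradyear from the example data.
* onepergrad is a hypothetical name for the new indicator variable.

* Flag exactly one row per distinct (pnr, gradyear) pair.
* Rows where gradyear is missing are never flagged, because
* egen's tag() excludes missing values unless the missing option is given.
egen onepergrad = tag(pnr gradyear)

* Tabulate graduation years over the flagged rows only,
* so each person is counted once per graduation year.
tab gradyear if onepergrad
```

With the example data this should give a frequency of 2 for gradyear 1 (persons 2 and 3), 1 for gradyear 3 (person 1), and 1 for gradyear 4 (person 3). Note that person 3 is counted under both year 1 and year 4 because their gradyear changes. If only one graduation per person should count, a different rule would be needed.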
