
  • Counting number of persons graduating each year in long-format data

    My data is individual-level panel data in long format. I have a variable 'pnr' holding the id of the person, 'year' holding the year, 'degree' (named 'education' in the example below) holding the type of the individual's highest obtained degree, and 'gradyear' holding the year in which the individual obtained that degree. My data looks something like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(pnr year gradyear education)
    1 1 . .
    1 2 . .
    1 3 3 1
    1 4 3 1
    2 1 1 2
    2 2 1 2
    2 3 1 2
    2 4 1 2
    3 1 1 1
    3 2 1 1
    3 3 1 1
    3 4 4 2
    end
    I need to count the number of individuals graduating in each year. A
    Code:
    tab gradyear
    won't do, because gradyear is repeated in every yearly observation of a person. Here the graduation year '3' gets a frequency of 2 even though only person 1 graduated in year 3. So how do I count the number of occurrences of gradyear==3 (and of the other years) while counting only one occurrence per person?

  • #2
    Code:
    duplicates drop pnr gradyear, force
    tab gradyear
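    If the full panel is still needed afterwards, a minimal sketch (assuming the example data above is in memory) wraps the same two commands in preserve and restore, since duplicates drop deletes observations from the data in memory:

    Code:
    * keep a copy of the data so the drop is only temporary
    preserve
    * keep one observation per person and graduation year
    duplicates drop pnr gradyear, force
    * rows with missing gradyear are left out of the table by default
    tab gradyear
    * bring back the full long-format panel
    restore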


    • #3
      Alternatively,
      Code:
      egen tag = tag(pnr gradyear)
      tab gradyear if tag


      • #4
        Code:
        . egen tag = tag(pnr gradyear)

        . tab gradyear if tag

           gradyear |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |          2       50.00       50.00
                  3 |          1       25.00       75.00
                  4 |          1       25.00      100.00
        ------------+-----------------------------------
              Total |          4      100.00
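
        If the counts are also wanted as a variable, one possible sketch sums the tag within each graduation year. The names gradtag and ngrad are made up for illustration, and the example data above is assumed to be freshly loaded. As in the table above, a person whose gradyear changes (person 3 here) is counted once in each of those years.

        Code:
        * mark one observation per person and graduation year
        egen gradtag = tag(pnr gradyear)
        * number of distinct persons per graduation year, copied to every row of that year
        bysort gradyear: egen ngrad = total(gradtag)
        * show one row per graduation year; missing gradyear is excluded by default
        tabdisp gradyear, cellvar(ngrad)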


        • #5
          Thanks all, for your solutions!
