  • Panel data: Values of first record in 2015

    Hello everyone.

    I have a panel dataset containing staff information based on performance report dates. The characteristics of a staff member can change over time, and we track this in the dataset. For example, someone can start in the company as an intern, then become an assistant, and then become a manager. I need to create a variable that holds the characteristic of each staff member at their first performance report in 2015. In other words, I need to know what each person's role was at their first report in 2015. Please see below a sample of my data; WhatIWant shows the desired result.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte ID str7 Idchange int Iddate str11(Role WhatIWant)
    1 "2266/12" 19241 "INTERN"      "MANAGER"    
    1 "911/14"  19794 "ASSISTANT"   "MANAGER"    
    1 "1031/15" 20166 "MANAGER"     "MANAGER"    
    1 "4869/15" 20429 "ASSISTANT"   "MANAGER"    
    2 "4869/16" 20127 "SUPERVISOR " "SUPERVISOR "
    2 "4869/17" 20431 "ASSISTANT"   "SUPERVISOR "
    end
    format %tddd-Mon-YY Iddate

    Thank you in advance,
    Marvin

  • #2
    Code:
    egen Firstin2015 = min(Iddate / (yofd(Iddate) == 2015)), by(ID)  
    gen Wanted = Role if Iddate == Firstin2015 
    bysort ID (Wanted) : replace Wanted = Wanted[_N] 
    list, sepby(ID)
    See http://www.stata-journal.com/sjpdf.h...iclenum=dm0055 for the trickiest bit.
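
    The division in the first line is probably the part that looks odd. A minimal sketch of why it works, using a cut-down version of the example data (the names ratio, first_a and first_b are made up for illustration): a true condition evaluates to 1 and a false one to 0, dividing by 0 yields missing, and egen's min() ignores missing values.

    ```stata
    * Sketch: why min(Iddate / (yofd(Iddate) == 2015)) picks the first 2015 date.
    clear
    input byte ID int Iddate
    1 19241
    1 20166
    1 20429
    end
    format %td Iddate

    * (yofd(Iddate) == 2015) is 1 when true and 0 when false.
    * Iddate / 1 is Iddate; Iddate / 0 is missing, which min() ignores.
    gen ratio = Iddate / (yofd(Iddate) == 2015)

    * The same result written more explicitly with cond():
    egen first_a = min(Iddate / (yofd(Iddate) == 2015)), by(ID)
    egen first_b = min(cond(yofd(Iddate) == 2015, Iddate, .)), by(ID)
    assert first_a == first_b
    list, sepby(ID)
    ```

    Either form should give the same variable; the cond() version is longer but may be easier to read later.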



    • #3
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte ID str7 Idchange int Iddate str11(Role WhatIWant)
      1 "2266/12" 19241 "INTERN"      "MANAGER"    
      1 "911/14"  19794 "ASSISTANT"   "MANAGER"    
      1 "1031/15" 20166 "MANAGER"     "MANAGER"    
      1 "4869/15" 20429 "ASSISTANT"   "MANAGER"    
      2 "4869/16" 20127 "SUPERVISOR " "SUPERVISOR "
      2 "4869/17" 20431 "ASSISTANT"   "SUPERVISOR "
      end
      format %tddd-Mon-YY Iddate
      
      gen int year = yofd(Iddate)
      
      by ID year (Iddate), sort: gen first_role_2015 = Role[1]
      replace first_role_2015 = "" if year != 2015
      by ID (first_role_2015), sort: replace first_role_2015 = first_role_2015[_N]
      
      sort ID Iddate
      list, noobs clean
      Note: If there is anybody in the data who has no records in year 2015, that person will have "" as the first 2015 role.
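
      If it helps to see who falls into that group, here is a rough sketch. It assumes a small made-up dataset in which ID 3 is a hypothetical person with no 2015 records; has2015 and tag are illustrative names, not part of the code above.

      ```stata
      clear
      input byte ID int Iddate str11 Role
      1 19241 "INTERN"
      1 20166 "MANAGER"
      3 19794 "ASSISTANT"
      end
      format %td Iddate

      * has2015 is 1 if the person has at least one 2015 record, 0 otherwise.
      egen byte has2015 = max(yofd(Iddate) == 2015), by(ID)

      * Tag one row per ID so that people are counted, not records.
      egen byte tag = tag(ID)
      count if tag & !has2015
      list ID Iddate Role if !has2015, sepby(ID)
      ```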

      Added: Crossed with Nick's solution which takes a slightly different approach.



      • #4
        Thank you both very much! I have some tricks for tagging desired cases in panel data, but this task was particularly tricky. The final goal is to collapse the data based on the 2015 information. Without creating this new set of 2015 variables, I would not be able to collapse my dataset by ID: I would get an error message, since each ID would have different values for these variables.

        Thank you again! Marvin
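
        For anyone following along, a sketch of what that last step might look like, using the code from #3 on a trimmed copy of the example data. It assumes that one row per ID is what is wanted, and it keeps the first row per ID rather than calling collapse itself.

        ```stata
        clear
        input byte ID int Iddate str11 Role
        1 19241 "INTERN"
        1 19794 "ASSISTANT"
        1 20166 "MANAGER"
        1 20429 "ASSISTANT"
        2 20127 "SUPERVISOR"
        2 20431 "ASSISTANT"
        end
        format %td Iddate

        * Build the 2015 variable as in #3.
        gen int year = yofd(Iddate)
        by ID year (Iddate), sort: gen first_role_2015 = Role[1]
        replace first_role_2015 = "" if year != 2015
        by ID (first_role_2015), sort: replace first_role_2015 = first_role_2015[_N]

        * Check the variable is constant within ID, then keep one row per ID.
        by ID (first_role_2015), sort: assert first_role_2015[1] == first_role_2015[_N]
        by ID: keep if _n == 1
        keep ID first_role_2015
        isid ID
        list, noobs clean
        ```

        The assert is only a safeguard: it stops the run if any ID still has more than one value before the extra rows are dropped.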



        • #5
          Hi Clyde,

          Will your code work with a date variable? I updated your code, formatting the new date variable (e.g. %td) and changing
          replace first_role_2015 = ""
          to
          replace first_role_2015 = .
          but I still have problems.
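
          One possible cause, assuming the new variable is numeric: numeric missing values sort last, whereas the empty string sorts first. After sorting, the [_N] subscript therefore picks up a missing value instead of the 2015 value, and [1] is needed instead. A sketch under that assumption, where Contractdate and first_date_2015 are made-up names and values:

          ```stata
          clear
          input byte ID int(Iddate Contractdate)
          1 19241 19000
          1 19794 19500
          1 20166 20100
          1 20429 20400
          2 20127 20000
          2 20431 20300
          end
          format %td Iddate Contractdate

          gen int year = yofd(Iddate)
          by ID year (Iddate), sort: gen long first_date_2015 = Contractdate[1]
          replace first_date_2015 = . if year != 2015

          * Numeric missing sorts last ("" sorts first), so take [1], not [_N].
          by ID (first_date_2015), sort: replace first_date_2015 = first_date_2015[1]
          format %td first_date_2015

          * Value at each person's first 2015 report.
          assert first_date_2015 == 20100 if ID == 1
          assert first_date_2015 == 20000 if ID == 2

          sort ID Iddate
          list, noobs clean
          ```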

