Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Fill in the missing years between start_year and end_year (not tsfill)

    Hi all,

    I met the following problem. Suppose my dataset is as follow. Each record is a career history(position_id) for an individual (user_id). Some individual we have multiple records (like here user_id == 103).

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(position_id user_id) str8 job_title float(start_year end_year)
    1 101 "manager"  2002 2007
    2 102 "engineer" 2005 2009
    3 103 "analyst"  2000 2002
    4 103 "consultor" 2001 2004
    end
    What I want is the following data set, that is: for each position_id, I copy the record for x times, where x = end_year-start_year+1, and then generate this new variable called year that starts from the start_year and goes to the end_year
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte position_id int user_id str9 job_title int(start_year end_year year)
    1 101 "manager"   2002 2007 2002
    1 101 "manager"   2002 2007 2003
    1 101 "manager"   2002 2007 2004
    1 101 "manager"   2002 2007 2005
    1 101 "manager"   2002 2007 2006
    1 101 "manager"   2002 2007 2006
    2 102 "engineer"  2005 2009 2005
    2 102 "engineer"  2005 2009 2006
    2 102 "engineer"  2005 2009 2007
    2 102 "engineer"  2005 2009 2008
    2 102 "engineer"  2005 2009 2009
    3 103 "analyst"   2000 2002 2000
    3 103 "analyst"   2000 2002 2001
    3 103 "analyst"   2000 2002 2002
    4 103 "consultor" 2001 2004 2001
    4 103 "consultor" 2001 2004 2002
    4 103 "consultor" 2001 2004 2003
    4 103 "consultor" 2001 2004 2004
    end

    I tried to use -tsfill-, but it can only help me creat a balance panel where for each position_id I have the same years, but still the years between the start_year and end_year have missing records. Do you know how to do the thing I described above?

    Thank you so much

  • #2
    Code:
    isid user_id position_id
    expand end_year-start_year + 1
    by user_id position_id, sort: gen year = start_year + _n - 1

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      Code:
      isid user_id position_id
      expand end_year-start_year + 1
      by user_id position_id, sort: gen year = start_year + _n - 1
      Thank you Clyde, this is exactly what I need!I should be more familiar with the use of _n within a group I guess

      Comment

      Working...
      X