Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Expanding dataset from obervations per job code to observations per year (based on isco)

    Hi
    Hi, Hi, Hi!
    My dataset consists of 450 variables and I need to expand it using the following variables.
    Currently, each line represents a job period. I need to extend the dataset using "start" and "stop" so that each observation represents one year.
    What I have:
    id job code start year stop year
    1 6675 1991 1999
    1 9675 1999 2000
    1 3371 2000 2003
    1 3371 2003 2004
    1 3371 2004 2007
    1 3371 2007 2008
    1 2283 2008 2012
    2 3022 1992 1995
    2 3022 1996 1997
    2 3022 1997 1999
    2 3022 2000 2002
    2 3022 2002 2004
    2 3022 2004 2009
    2 98 2006 2009
    2 3912 2008 2012
    3 2130 1994 1996
    3 2130 1996 1999
    3 95 1999 2000
    3 2130 200 2001
    3 2123 2002 2012
    4 98 1989 1990
    4 1733 1990 2000
    4 1914 2000 2006
    4 7123 2006 2010
    4 3141 1976 2000
    5 1241 2000 2002
    5 2255 2002 2007
    5 2255 2007 2010
    5 2546 2010 2012
    What I need:
    id job code start year stop year year
    1 6675 1991 1999 1991
    1 6675 1991 1999 1992
    1 6675 1991 1999 1993
    1 6675 1991 1999 1994
    1 6675 1991 1999 1995
    1 6675 1991 1999 1996
    1 6675 1991 1999 1997
    1 6675 1991 1999 1998
    1 6675 1991 1999 1999
    1 9675 1999 2000 1999
    1 9675 1999 2000 2000
    1 3371 2000 2003 2000
    1 3371 2000 2003 2001
    1 3371 2000 2003 2002
    1 3371 2000 2003 2003
    1 3371 2003 2004 2003
    1 3371 2003 2004 2004
    1 3371 2004 2007 2004
    1 3371 2004 2007 2005
    1 3371 2004 2007 2006
    1 3371 2004 2007 2007
    1 3371 2007 2008 2007
    1 3371 2007 2008 2008
    1 2283 2008 2012 2008
    1 2283 2008 2012 2009
    1 2283 2008 2012 2010
    1 2283 2008 2012 2011
    1 2283 2008 2012 2012
    2 3022 1992 1996 1992
    I tried using
    Code:
    expand stopyr-startyr+1
    by id, sort: gen year = startyr + _n -1
    But it didn't do the trick.

  • #2
    Welcome to Statalist. In future, please understand the rules in the FAQ and post your data using dataex, it'd be more convenient for other users to set up their data and test their code. And you'll in turn get working code more quickly.

    Code:
    clear
    input id     job_code     start_year     stop_year
    1     6675     1991     1999
    1     9675     1999     2000
    1     3371     2000     2003
    1     3371     2003     2004
    1     3371     2004     2007
    1     3371     2007     2008
    1     2283     2008     2012
    2     3022     1992     1995
    2     3022     1996     1997
    2     3022     1997     1999
    2     3022     2000     2002
    2     3022     2002     2004
    2     3022     2004     2009
    2     98     2006     2009
    2     3912     2008     2012
    3     2130     1994     1996
    3     2130     1996     1999
    3     95     1999     2000
    3     2130     200     2001
    3     2123     2002     2012
    4     98     1989     1990
    4     1733     1990     2000
    4     1914     2000     2006
    4     7123     2006     2010
    4     3141     1976     2000
    5     1241     2000     2002
    5     2255     2002     2007
    5     2255     2007     2010
    5     2546     2010     2012
    end
    
    * Compute elaped years
    gen dur_year = stop_year - start_year + 1
    * Expand by that
    expand dur_year
    * Generate the year variable
    bysort id job_code start_year: gen year = start_year + _n - 1
    * Clean up and sort the data for viewing
    drop dur_year
    gsort id start_year year
    Notice that there is a person starting in year 200.

    Comment


    • #3
      Thank you for helping me, it worked perfectly.
      I'll be sure to use dataex in the future.

      Comment

      Working...
      X