I am currently working on a dataset of physicians, with variables year* identifying the years they were registered in a database:
ID year1 year2 year3 year4 year5
1 1995 1996 1997 1998 1999
2 1995 1997 1998 1999 2000
3 1997 2001 2002 2003 2004
4 2003 2005 2006 2010 2011
I am trying to generate a series of variables to identify how many times they dropped out, as well as the intervals in which these physicians are not registered in that database:
ID missed yearmissedbeg1 yearmissedend1 yearmissedbeg2 yearmissedend2
1 0 . . . .
2 1 1996 1996 . .
3 1 1998 2000 . .
4 2 2004 2004 2007 2009
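To make the intended rule concrete, here is a minimal sketch of the row-wise logic, written in Python rather than Stata purely as an illustration. It assumes the year values in each row are sorted in ascending order with no missing values, and the function name `missed_intervals` is made up for this example. A gap is any pair of adjacent registered years that differ by more than one.

```python
def missed_intervals(years):
    """Return (number of gaps, [(first_missing_year, last_missing_year), ...]).

    Assumes `years` is sorted ascending and contains no missing values.
    """
    gaps = []
    # Compare each registered year with the next one in the same row.
    for prev, nxt in zip(years, years[1:]):
        if nxt - prev > 1:
            # The years strictly between prev and nxt were missed.
            gaps.append((prev + 1, nxt - 1))
    return len(gaps), gaps


# The four example physicians from the listing above, keyed by ID.
rows = {
    1: [1995, 1996, 1997, 1998, 1999],
    2: [1995, 1997, 1998, 1999, 2000],
    3: [1997, 2001, 2002, 2003, 2004],
    4: [2003, 2005, 2006, 2010, 2011],
}

for pid, years in rows.items():
    missed, gaps = missed_intervals(years)
    print(pid, missed, gaps)
```

Each row is processed on its own by comparing adjacent year columns, so the data stay in wide form throughout; nothing in this logic depends on reshaping to long.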
I have read the Stata FAQ pages on tsset, but it seems I would have to reshape my dataset to long format for that solution to work. That would not be ideal, as I have more than 17,000 observations spanning 16 years, as well as 40 other variables for the physicians.
Any help would be appreciated.