I'm planning to conduct a survival analysis, but I have several questions about how the data should be structured. Here is what is known so far.
Data are available for 2007-2009. The variables are defined as follows:
id=firm id
current_yr=current fiscal year
founding_yr=year when the focal firm was founded
bankruptcy_yr=year when the focal firm went out of business
OOB=out of business indicator (1 if a firm goes out of business and 0 otherwise)
Here is my question: when I generate the dummy variable "OOB" to indicate whether a firm filed for bankruptcy (OOB==1) or not (OOB==0), how should I code it for firms whose bankruptcy year falls after the sample period? For instance, in the data set below, firm 1 didn't file for bankruptcy within the sample period (its bankruptcy year is 2011), so I assume "OOB" should equal 0 in every record for firm 1. Is that correct?
Also, is there any error in the last command below (stset)? Should I specify stset differently in order to conduct the survival analysis? Thanks!
Code:
id current_yr founding_yr bankruptcy_yr
1  2007       2005        2011
1  2008       2005        2011
2  2007       2007        2010
2  2008       2007        2010
2  2009       2007        2010

gen yr_since_found=current_yr-founding_yr
gen OOB=bankruptcy_yr==current_yr
Code:
id current_yr founding_yr bankruptcy_yr OOB
1  2007       2005        2011          0
1  2008       2005        2011          0
2  2007       2007        2010          0
2  2008       2007        2010          0
2  2009       2007        2010          1

stset yr_since_found, failure(OOB==1)
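To make the setup concrete, here is a self-contained sketch of what I am considering. It is only my guess at a reasonable specification, not something I know to be correct. The id(id) option is my own addition: since each firm has several rows, I assume stset needs to be told which rows belong to the same firm. The final list command is there only to inspect the variables that stset creates.

```stata
clear
input id current_yr founding_yr bankruptcy_yr
1 2007 2005 2011
1 2008 2005 2011
2 2007 2007 2010
2 2008 2007 2010
2 2009 2007 2010
end

* Years since founding at each firm-year record
gen yr_since_found = current_yr - founding_yr

* 1 only in a firm-year whose fiscal year equals the bankruptcy year.
* In these five rows bankruptcy_yr (2011, 2010) never equals current_yr,
* so the expression evaluates to 0 on every row.
gen byte OOB = (bankruptcy_yr == current_yr)

* Assumption: declare the data as multiple-record survival data,
* with id() linking the rows that belong to the same firm
stset yr_since_found, id(id) failure(OOB==1)

* Inspect the variables stset created for each firm
list id current_yr yr_since_found OOB _t0 _t _d _st, sepby(id)
```

If I understand correctly, a firm whose OOB is 0 in all of its rows would simply be treated as censored at its last record. One thing I am unsure about is the 2007 row for firm 2, where yr_since_found is 0: I believe stset may flag or exclude a record that ends at time 0, so the time variable might need to be defined differently.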