Dear all,
I would like to apply survival analysis techniques to my dataset, but I am having problems declaring survival-time data in Stata 14.
I have an unbalanced panel containing annual financial data for firms and a variable (exit) equal to 1 in the year of the firm's exit from the market and 0 in previous years.
The majority of firms in my data do not exit the market, and for some firms the variable "exit" is missing in all years.
I tried to run the following command:
stset year, id(ID) failure(exit=1)
I have some doubts about this command and the four generated variables:
1) _st (= 1 if record is to be used; 0 otherwise). For all observations "_st" is equal to 1 (I had 0 exclusions). How is this possible if my "exit" variable is missing for some firms? How can Stata use these observations?
2) _d (= 1 if failure; 0 if censored). The variable "_d" is equal to 1 when firms exit (that is, when "exit" is equal to 1) and 0 otherwise. I cannot understand why it is not missing for firms whose "exit" variable is missing. Does Stata treat firms without information on their exit as censored observations?
3) _t (analysis time when record ends). This variable is equal to "year", as I expected.
4) _t0 (analysis time when record begins). This variable is equal to "year"-1, with the exception of the first year for each firm: "_t0" is equal to 0 in the first year in which a firm appears in the dataset. I do not know whether this is correct or whether it could generate errors in the analysis.
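In case it helps to reproduce what I describe, the four points above can be inspected with something like the following. This is only a rough sketch: ID, year and exit are my actual variables, while n_nonmissing and firm_tag are helper names made up for the check, and I wrote the failure condition with "==" as shown in the stset documentation.

```stata
* Count firms whose "exit" indicator is missing in every year
* (n_nonmissing and firm_tag are hypothetical helper variables)
bysort ID: egen n_nonmissing = count(exit)
egen firm_tag = tag(ID)
count if firm_tag & n_nonmissing == 0

* Declare the survival-time data as in my command above
stset year, id(ID) failure(exit==1)

* Cross-tabulate exit against _d, keeping missing values visible
tabulate exit _d, missing

* Inspect the generated variables record by record
* (assumes the dataset has at least 20 observations)
sort ID year
list ID year exit _st _d _t0 _t in 1/20, sepby(ID)
```

The cross-tabulation should show directly how records with a missing "exit" end up coded in "_d", and the listing shows "_t0" and "_t" for each firm's first and later records.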
Thanks a lot in advance for your help.
Best wishes, Chiara