Dear Clyde Schechter and Nick Cox,
I am trying to set up my dataset correctly for a survival analysis of bank-borrower relationships. My failure event is getting a loan after the treatment starts (2020Q3). My dataset is at the loan level, but I want to run this analysis at both the bank-borrower and the bank-borrower-quarter level. My code is below.
However, when I run the Cox regression and plot the results for both groups (T and C), both survival curves converge to zero. That is clearly an error: the relationships do not end, but my dataset does, at 2023Q4. My questions are:
1. How can I deal with this problem (right censoring)? I have read that I can replace the last quarter with 0 or missing to fix it, but I don't understand why.
2. When I generate the 'time' variable, it is nonmissing only for observations with failure == 1. Is it OK to replace those missing values with 0?
Code:
bys uniqueborrowerid bankid (numeric_quarter): gen failure = (cond(nr_loans,numeric_quarter>=242,.)) // 242 = tq(2020q3), the quarter in which treatment starts
tempvar minq maxq
bys uniqueborrowerid bankid: egen `minq' = min(numeric_quarter) // when the relationship started
bys uniqueborrowerid bankid: gen `maxq' = numeric_quarter if failure==1 // identifies when failure happens
gen time = `maxq' - `minq'
egen relationid = group(uniqueborrowerid bankid)
collapse (max) failure treat time (mean) nloans_bfaceli, by(relationid numeric_quarter)
stset time, id(relationid) failure(failure==1)
stcox i.treat, vce(cluster relationid)
stcurve, survival at1(treat=0) at2(treat=1) title("Cox survival of bank-borrower relationship by treatment") xtitle("Analysis time (in quarters)") legend(pos(8) col(2) ring(0)) xlabel(#8)
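
For reference, here is a minimal sketch of how right censoring is conventionally encoded before stset, using toy data rather than my actual dataset. The variables first_q, fail_q and exit_q are hypothetical names invented for this illustration, and I have not confirmed that this is the right setup for my case. The idea is that a relationship with no observed event keeps failure = 0 and gets an analysis time that runs to the last quarter in the data, instead of a missing or zero time.

```stata
* Sketch only: toy data; first_q, fail_q and exit_q are hypothetical names
clear
input relationid first_q fail_q treat
1 236 244 1
2 238 .   0
3 240 250 1
4 239 247 0
5 237 .   1
6 241 252 0
end

* Last quarter covered by the data (2023Q4)
local last_q = tq(2023q4)

* 1 if the event was observed, 0 if the relationship is right-censored
gen byte failure = !missing(fail_q)

* Exit quarter: the event quarter if observed, otherwise the end of the data
gen exit_q = cond(failure, fail_q, `last_q')

* Analysis time is nonmissing for every relationship, censored or not
gen time = exit_q - first_q

* One record per relationship; censored records enter with failure == 0
stset time, id(relationid) failure(failure==1)
list relationid treat time failure _t _d
```

In this layout a censored relationship still contributes to the risk set up to 2023Q4, whereas setting its time to 0 or leaving it missing would, as far as I understand, exclude it from the estimation sample.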
