  • Heckman correction for survival analysis

    Hello,
    I am working on an analysis of startup performance. The main question is whether startups perform differently depending on the size of the firm their founders came from: that is, do firms created by founders coming from small firms fare better or worse than firms created by founders coming from large firms?

    I have a dataset at the firm-year level, with extensive information on each firm and on its founder. To measure performance, I am running correlated random effects models, using the logarithm of gross value added as the dependent variable. My sample is composed of startups created between 2005 and 2015, and I follow them until 2019 or until they die.

    I am facing one issue that may be biasing my results, and I need some help with it: many startups close only a few years after opening. To partially mitigate this, I restrict my analysis to firms that survive at least 2 years (that is, I exclude firms that die after only 1 year). But I'd like to address it further because, as it stands, my sample is composed largely of "survivors": firms that survive longer contribute many more observations to my sample than those that die along the way.

    While researching how to deal with missing data/observations, I came across the Heckman correction, which seems to be one of the most widely used approaches. I thought of applying it in my study, but I have a problem that I do not know how to overcome: in the Heckman correction examples I have seen, even though the dependent variable was missing for some observations, those rows still had values for the other variables. In the classic example of a regression with wage as the dependent variable, the issue is that only people who are working have a wage (that is, only people with Employed==1). But for the people with Employed==0, we still have information other than wage.

    In my analysis, I only have information on a firm in the years in which Survived==1. Once a startup dies, it no longer appears in the dataset at all, so there is no information on it from that point on.
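
    To make the data structure concrete, here is a minimal Python sketch with made-up numbers (the firm IDs, years, and values are hypothetical, and so are the names panel, summary, and presumed_dead). It assumes that a firm whose last observed year is before 2019 has died, which is how exit would have to be inferred, since there is no row for a dead firm.

```python
from collections import defaultdict

END_OF_WINDOW = 2019  # last year firms are followed in the sample

# Hypothetical firm-year rows: (firm_id, year, log_gva). All values are made up.
panel = [
    ("A", 2015, 10.2), ("A", 2016, 10.5), ("A", 2017, 10.9),
    ("A", 2018, 11.0), ("A", 2019, 11.3),
    ("B", 2015, 9.8), ("B", 2016, 9.6),
    ("C", 2015, 10.0), ("C", 2016, 10.1), ("C", 2017, 9.9),
]

# Collect the years in which each firm is observed.
years_by_firm = defaultdict(list)
for firm_id, year, _ in panel:
    years_by_firm[firm_id].append(year)

# Assumption: a firm that disappears before the end of the window has died.
summary = {}
for firm_id, years in years_by_firm.items():
    last_year = max(years)
    summary[firm_id] = {
        "n_obs": len(years),
        "last_year": last_year,
        "presumed_dead": last_year < END_OF_WINDOW,
    }

for firm_id, info in sorted(summary.items()):
    print(firm_id, info)
```

    In this toy panel the surviving firm A contributes five rows, while B and C contribute two and three. Unlike the wage example, there is no row with covariates for B in 2017 or for C in 2018: the only trace of their exit is that they stop appearing.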

    Given this, is it possible at all to apply the Heckman correction in my case? Is the Heckman correction ever applicable to survival models?

    If not, what other options do you know of that could help me mitigate the possible bias issues that my models may have?

    Thank you very much for any help you may provide!
    Rui Agostinho
