
  • How to transform probability of survival to the expected survival time

    Hi,

    I was wondering whether there is a way in Stata to find the expected survival time instead of probability of survival?

    Here is my specific case. In the following sample, survival means staying with the organization. I have data on whether the person was with the organization at the end of 2013, 2014, 2015, and 2016. A missing value means the person was not with the organization at the end of that year: either they had not been hired yet or they had left. I also have tenure in the organization, whether the person turned over, and their composite performance (overall performance over time in the organization).

    I really appreciate your help.

    Code:
    input int age int retention_2013 int retention_2014 int retention_2015 int retention_2016 float composite_performance int tenure int turnover
    35 1 1 . . 0.32 5 1
    44 . 1 1 1 0.80 4 0
    35 . 1 . . -0.42 2 1
    41 . 1 1 . -0.392 3 1
    43 . . . 1 -0.9 5 0
    73 1 1 1 1 -0.89 6 0
    60 . . . 1 -0.94 5 0
    42 1 . . . 0.153 4 1
    end

  • #2
    It can't be done. These survival data are censored: some people are still retained as of the end of data collection. Because the eventual survival time of those people is unknown, the expected survival time cannot be estimated.
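
    To make the obstacle concrete, here is a standard identity (not something stated in the thread). For a nonnegative survival time T with survivor function S(t), the mean is the area under the survivor curve. The symbol \tau is notation introduced here for the longest follow-up time in the data; if some people are still retained at \tau, the curve has not reached zero and the tail term below is not observed.

    ```latex
    E[T] = \int_0^\infty S(t)\,dt
         = \underbrace{\int_0^\tau S(t)\,dt}_{\text{restricted mean, estimable}}
         + \underbrace{\int_\tau^\infty S(t)\,dt}_{\text{tail, not observed}}
    ```

    Only the first term can be estimated from censored data without further assumptions.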



    • #3
      Thanks Clyde,

      Do you think the only way to predict expected tenure (expected time the person will stay in the organization) is using interval regression? I'm not sure if it's the right way to do it.

      Something like:

      Code:
       
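       * upper bound: observed tenure for leavers, missing (right-censored) for those still employed
       gen tenure2 = tenure if turnover==1
       intreg tenure tenure2 age composite_performance
       * predict needs a new variable name, because tenure already exists
       predict tenure_hat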
      Thanks for your help



      • #4
        -intreg- makes strong parametric (normality) assumptions which are generally violated by time-to-event variables. It's not a question of picking a different command really. The problem is that your data does not provide the information needed to have a valid estimate of expected times.

        One way around this is to get a data set that has complete data: everybody in the data set has already completed their tenure and is no longer there. In that case, you can calculate expected values. The problem is that such a data set presumably embodies a cohort or cohorts that began fairly far in the past, and inferences from them to the present situation can be questionable.

        Another approach is to use a parametric model: the parametric assumption imposes constraints that substitute for information in the data. -intreg- is one such, but its use of a normal distribution makes it poorly suited to the reality that is typical for time-to-event variables. The -streg- command offers a variety of parametric models that can be fit to the data, most of which support estimation of expected durations. Which of the distribution families available is most suited to your kind of data is a content-area question that I wouldn't be able to advise you on.
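
        As an illustration of how a parametric assumption fills in the unobserved tail, take a Weibull model in proportional-hazards form. This is a sketch, not a recommendation of that family, and the parameterization is the one I understand -streg- to use for the Weibull; check the Methods and formulas section of the manual before relying on it. Here p is the shape parameter and x\beta the linear predictor.

        ```latex
        S(t \mid x) = \exp\!\left\{-\exp(x\beta)\, t^{p}\right\},
        \qquad
        E[T \mid x] = \int_0^\infty S(t \mid x)\,dt
                    = \exp\!\left(-\frac{x\beta}{p}\right)\Gamma\!\left(1 + \frac{1}{p}\right)
        ```

        Once p and \beta are estimated, the functional form determines the whole survivor curve, including the part beyond the last observed time, so the mean follows from the assumed distribution rather than from the data alone.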



        • #5
          Thanks for the helpful information. If I choose a relevant parametric distribution in streg, is it possible to predict values after that? Does the predict command work for streg too?



          • #6
            Yes, if you read -help streg postestimation- you will see that you can use -predict- after -streg-, and, in particular, you can ask for mean time as the predicted outcome (unless you use a Gompertz model).

            Before you go jumping into this, do read the manual sections on -stset-, -streg-, and -streg postestimation-. I'm inferring that you're not familiar with survival analysis commands in Stata because it was not your first inclination to use them for this problem. Bear in mind that you will also have to do some data management to get your data in the right form to apply -stset- and run survival analysis commands.

            And again, I recommend you consult with somebody who has some expertise in tenure issues to get a sense of which of the parametric survival models is most suitable for modeling tenure in an organization such as the one this data comes from.
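
            For orientation, the sequence of commands could look roughly like the sketch below. It assumes the example data from #1 are in memory, that tenure measures time from hire to exit or to the end of observation, and that turnover == 1 marks people who left. The Weibull family is only a placeholder, and mean_tenure is a made-up variable name.

            ```stata
            * Sketch only: declare the data as survival-time data.
            * tenure is the analysis time; turnover == 1 is the failure event,
            * so people with turnover == 0 are treated as right-censored.
            stset tenure, failure(turnover == 1)

            * Fit a parametric survival model. Weibull is a placeholder choice,
            * not a recommendation; the right family is a content-area question.
            streg age composite_performance, distribution(weibull)

            * Predicted mean time to turnover, in the same units as tenure.
            * mean_tenure is a hypothetical variable name.
            predict mean_tenure, mean time
            list age composite_performance tenure turnover mean_tenure
            ```

            With only eight observations the model may fail to converge or give meaningless estimates, so treat this as a template for the full data set rather than as a result.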



            • #7
              Thank you very much. I will ask my advisor about how we need to approach survival analysis and the right distribution.
