
  • Selecting parametric survival models

    Dear All,
    I'm selecting the best-fitting parametric survival model from among the exponential, Weibull, log-logistic, gamma, lognormal, and Gompertz distributions.
    I have obtained the AIC and log-likelihood statistics. As I understand it, the best-fitting model is the one with the lowest AIC and the highest log likelihood. But the best-fitting model, log-logistic, does not give the most satisfactory results for my variables, whereas the Weibull results are what I want. Which one should I choose?
    I attached the comparison table for each model; the differences between the AICs and log likelihoods are not very big.

    Thank you guys.

    Best
    Josh

    [Attached image: 2023-03-23 20.44.29.png]


  • #2
    Choosing the model whose results most accord with your wishes is not science. In fact, it borders on scientific misconduct.

    Ideally, the selection of a model is made before the analysis is even done and is based on previous exploratory studies of the same or a closely related question in the same or a closely related population/setting. In reality, we often do not have that prior knowledge to rely on. The next best case is if you can argue from the known properties of the models in question which is best. For example, exponential models have constant hazard: are you modeling a process that one would reasonably expect to have a constant hazard? Gompertz distributions have a monotone hazard, increasing when the shape parameter is positive--does that describe the process you are modeling? Etc.
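
One way to put the hazard-shape question to the data, offered as a rough sketch rather than a prescription, is to compare a smoothed nonparametric hazard estimate with the hazard implied by a fitted parametric model. The example below uses Stata's shipped cancer dataset and the covariates drug and age purely as stand-ins; none of these come from the original post, and the graph names are arbitrary.

```stata
* Illustrative data only: Stata's example cancer dataset, not the poster's data
webuse cancer, clear
stset studytime, failure(died)

* Smoothed nonparametric hazard estimate: is it flat, monotone, or humped?
sts graph, hazard name(np_hazard, replace)

* Hazard implied by one fitted parametric model, for visual comparison
streg i.drug age, distribution(weibull)
stcurve, hazard name(weibull_hazard, replace)
```

If the nonparametric curve rises and then falls, a distribution restricted to a monotone hazard (exponential, Weibull, Gompertz) is hard to defend on shape grounds alone.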

    As a default, if there is no good theoretical understanding of the process that will support the kind of inquiry I've just outlined, then, as a last resort, we can rely on fit statistics like AIC or BIC, or the log likelihood. (By the way, be careful about comparing log likelihoods from different commands in Stata. Some likelihoods have a constant factor in them, and some of Stata's procedures will include that in the log likelihood and others won't, so that it's not an apples-to-apples comparison. This is not an issue for you because all of these come from the same command, -streg-.)
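
For reference, a comparison table like the one attached to the first post can be produced along these lines. This is a minimal sketch under assumptions: the dataset (webuse cancer), the covariates, and the stored-estimate names m_* are illustrative and not taken from the thread, and "gamma" is taken to mean -streg-'s generalized gamma, distribution(ggamma).

```stata
* Illustrative data only: Stata's example cancer dataset, not the poster's data
webuse cancer, clear
stset studytime, failure(died)

* Fit the same covariates under each candidate distribution and store the results
foreach d in exponential weibull gompertz lognormal loglogistic ggamma {
    quietly streg i.drug age, distribution(`d')
    estimates store m_`d'
}

* One table with N, log likelihood, degrees of freedom, AIC, and BIC for every model
estimates stats m_*
```

Because every row comes from -streg- on the same stset data, the log likelihoods and information criteria in that table are directly comparable.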

    By the way, I disagree strongly with your assessment that the log likelihoods here are not very different. Comparing the ll for Weibull and log-logistic, their difference is 137. While that may be a small relative difference for numbers in the mid-2000 range, the absolute difference of 137 is actually quite large. That's the difference in the natural logarithms of the likelihoods. The likelihood ratio is therefore exp(137), which is massive (on the order of 10^59).
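
The arithmetic behind that statement can be checked directly in Stata. The only input is the difference of 137 quoted above; the scalar name dll is arbitrary, and the AIC line assumes both models estimate the same number of parameters, which the thread does not state.

```stata
* Log-likelihood difference quoted in the post above
scalar dll = 137

* Likelihood ratio implied by that difference
display "likelihood ratio = " %9.3e exp(dll)

* The same quantity expressed as a power of 10
display "log10 of ratio   = " %6.2f dll/ln(10)

* Implied AIC gap, ASSUMING equal parameter counts (AIC = -2*ll + 2*k)
display "AIC difference   = " 2*dll
```

With equal parameter counts the penalty terms cancel, so the AIC gap is simply twice the log-likelihood gap.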



    • #3
      In reply to Clyde Schechter (#2):
      Thank you, Clyde, for your comments. Since the Weibull results support my hypothesis, which is grounded in theory and the literature, I'd like to know whether it is acceptable to use that model.
      Your reply really makes sense, thank you again.
