Why is Stata generating _d = . when my failures are all recorded?

Martin Imelda Borg

Join Date: Jan 2022

Posts: 225
#1

26 Jul 2023, 07:54

Hi, I'm plotting my survival curves.
For my time variable (survivalt), I don't have missing values. It sits adjacent to the timetodeath variable.
For my failure variable (revised), I don't have missing values either.

Yet Stata generates a few missing values: _d = . and _t = .

Why is Stata coding them as missing when they are not actually missing?

And perhaps this may explain why I can not get my 25% 50% 75% survival times? Unless someone else can offer another explanation for this

Code:

stsum, by(tkronly)

Please note - I know I've submitted screenshots, which goes against Statalist guidance <3 but I needed you to see the actual data rather than sample data, to check whether I'm doing anything glaringly wrong, because I can't see a problem.
Maarten Buis

Join Date: Mar 2014

Posts: 3467
#2

26 Jul 2023, 08:09

My eyes are not as good as they used to be (and they were never very good to begin with), so those screenshots are completely useless to me. Also, what is shown on screen is not necessarily the same as what is in your data. So we really, really do need sample data and not screenshots. We also need the exact command you used to stset the data. That is probably where your problem originates.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Martin Imelda Borg

Join Date: Jan 2022

Posts: 225
#3

26 Jul 2023, 08:44

That's a pity. As I have said, I am unable to provide a real copy of the data, as it is stored on a remote platform which has no internet access. I normally play around with sample data and use that for examples. But in this case I cannot, as I'm quite sure I'm using the right code. I thought someone on Statalist would know why missing values are generated for _d and _t following stset.

Code:

stset survivalt, failure(revised==1)
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2405
#4

26 Jul 2023, 11:24

From the screenshot you show, I can just make out what the problem is (or one problem), assuming you have used the -stset- statement in #3.

If survival time is 0, these observations are necessarily excluded from the analysis population because they have not survived any amount of time, by definition.

Also, as you have pointed out in another related thread, you know that you can post fake data that replicates some feature of your data. Please keep that in mind when asking for help. We don’t need your real data, just realistic or toy data from which to work.
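
The exclusion of zero survival times described above can be reproduced with toy data. The following is only a sketch: the variable names survivalt and revised are taken from #3, while the id variable and all of the values are made up for illustration. Under Stata's default entry time of 0, the two observations with survivalt equal to 0 should be flagged with _st = 0, and _t0, _t and _d should be left missing for them.

```stata
* Toy data (hypothetical values): two subjects have a survival time of exactly 0
clear
input id survivalt revised
1  0    1
2  0    0
3  2.5  1
4  7.1  0
5 12.3  0
end

* Same stset call as in #3
stset survivalt, failure(revised==1)

* _st marks whether an observation is used in the analysis;
* for excluded observations _t0, _t and _d are left missing
list id survivalt revised _st _t0 _t _d

* How many observations have a non-positive survival time?
count if survivalt <= 0
```

The output of stset itself also reports how many observations end on or before entry, which is a quick way to confirm that this is the cause in the real data.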
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#5

26 Jul 2023, 12:26

To add to what Leonardo Guizzetti has said, I'll respond to your other question:

And perhaps this may explain why I can not get my 25% 50% 75% survival times? Unless someone else can offer another explanation for this

The reason that the times for 50% and 75% failure (and, in the case of TKR, 25%, too) are missing is that your rate of failure is so low that the times to these proportions of failure are longer than anything observed in your data. In other words, even at the end of the longest observation period in your study, you still have not had 25% of the people/firms/whatever they are fail in the TKR group. In the THR group, you do achieve 25% of the units failing by t = 16.22466, but the failure percentage never does get up to 50% even by the end of your data.
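
The same behaviour can be seen in simulated data. This is a minimal sketch under assumed numbers, not a reconstruction of the poster's data: failure times are drawn from an exponential distribution with mean 200 and everyone still at risk is censored at t = 20, so only about 10% of subjects are expected to fail during follow-up. The variable t_event, the sample size and the seed are hypothetical.

```stata
* Simulated data with a low failure rate (all numbers are assumptions)
clear
set obs 200
set seed 12345

* True failure time: exponential with mean 200
gen double t_event = rexponential(200)

* Follow-up stops at t = 20; anyone who has not failed by then is censored
gen double survivalt = min(t_event, 20)
gen byte revised = t_event <= 20

stset survivalt, failure(revised==1)

* The survivor function never falls to 0.75, so the 25%, 50% and 75%
* survival times cannot be estimated and are shown as missing
stsum

* Kaplan-Meier survivor function at selected times
sts list, at(5 10 15 20)
```

Because the estimated survivor function stays well above 0.75 through the end of follow-up, stsum has no time at which to report the percentiles.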
Martin Imelda Borg

Join Date: Jan 2022

Posts: 225
#6

28 Jul 2023, 02:52

Originally posted by Clyde Schechter

To add to what Leonardo Guizzetti has said, I'll respond to your other question:

The reason that the times for 50% and 75% failure (and, in the case of TKR, 25%, too) are missing is that your rate of failure is so low that the times to these proportions of failure are longer than anything observed in your data. In other words, even at the end of the longest observation period in your study, you still have not had 25% of the people/firms/whatever they are fail in the TKR group. In the THR group, you do achieve 25% of the units failing by t = 16.22466, but the failure percentage never does get up to 50% even by the end of your data.

This is really interesting. Thanks for this perspective. Do I just accept that the procedure is so successful there is no 50% failure rate?
How does one normally address this?
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2405
#7

28 Jul 2023, 06:02

Originally posted by Martin Imelda Borg

This is really interesting. Thanks for this perspective. Do I just accept that the procedure is so successful there is no 50% failure rate?
How does one normally address this?

Pretty much, yes. You could wait longer to accumulate more failures, but that’s not usually feasible, especially with such a low incidence rate of failure.
