Cox regression: age as time-scale: how to set up data?

Sascha Vum

Join Date: Jul 2018

Posts: 4
#1

20 Jul 2018, 07:48

Hi everyone, I'm new to Stata and I'm having trouble setting up my data for survival analysis.

Participants have up to 5 measurement waves. Say two variables are measured at each wave: age and health status. The event is death, and we know the date of death (and thus the age at death).
Take the person in the table below as an example: he was 55.3 when he entered the study and 65.1 when he died, so the event took place between waves 4 and 5. At which wave do I record the event? Or do I have my data set up all wrong?

I will use age as a time-scale for the Cox regression, but I'm not sure how/where to include the age of the event (or censoring, since censoring due to loss to follow-up also occurs in between waves).

Participants can enter the study at different ages.

id   wave   age    health status   died
1    1      55.3   0
1    2      58.7   1
1    3      61.3   1
1    4      64.5   2
1    5

Can anyone help me to set up my data?
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

20 Jul 2018, 10:35

Well, as far as I can see, the wave variable is irrelevant. Your failure event is death, your time variable is age, and you know the actual age at which death occurs, regardless of whether it falls between waves. So your data needs to record a death event at age 65.1. It doesn't matter which wave you assign that to, nor, for that matter, whether you record a wave number for the event at all. The wave number will play no role in your analysis.

For those who are lost to follow-up, the rule is that the person is considered censored as of the last moment at which he or she is known to have not yet failed, i.e., the greatest age at which the person was known to be alive. So if a person enters the study at age 62, the last contact with her is at age 64.7, and her status from that point on is unknown due to dropout, she is censored at age 64.7. Again, it doesn't matter what wave number you call that (nor whether you even bother to give it a wave number).
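As a minimal sketch of the setup described above (not run against the poster's data), a one-record-per-person file needs only the age at entry, the age at exit, and a death indicator. The variable names age_entry, age_exit, and died are assumptions for illustration; the two rows reuse the ages mentioned in this thread.

```stata
clear
input id age_entry age_exit died
1 55.3 65.1 1
2 62.0 64.7 0
end

* Age is the time scale: each person comes under observation at age_entry
* (delayed entry) and leaves at age_exit, by death (died==1) or censoring.
stset age_exit, failure(died==1) enter(time age_entry)

* Inspect what stset built: _t0 = entry age, _t = exit age, _d = death flag
list id _t0 _t _d
```

With real data, an -stcox- call with the covariates of interest would follow the -stset- step.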
Sascha Vum

Join Date: Jul 2018

Posts: 4
#3

21 Jul 2018, 01:24

Hi Clyde,

Thanks for your reply. I also thought wave didn't really play a role here, since we're interested in age. But then how do I reshape my data from wide to long? The ages differ for everyone, so I can't use them as the stubname. Sorry, I'm very new to this and have only worked with SPSS.
So how do I get a datafile that links age to the event?
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

21 Jul 2018, 09:37

The -snapspan- command will do this for you. Run -help snapspan- and click on the link to the PDF documentation near the top of that page. There you will find the details of how to use this command, clearly explained, with worked examples.
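To give a rough idea of what -snapspan- does, here is a sketch using the ages from post #1, with one extra row added for the death at 65.1. The variable names health, died, and age0 are assumptions for illustration. -snapspan- treats died as an event that happens at the recorded age and every other variable as a measurement that holds until the next row.

```stata
clear
input id age health died
1 55.3 0 0
1 58.7 1 0
1 61.3 1 0
1 64.5 2 0
1 65.1 . 1
end

* Convert snapshots to time spans: age0 is created as the start of each
* interval, and health is shifted so that each row carries the value
* measured at the start of its interval.
snapspan id age died, generate(age0) replace

* The first row per person describes time before study entry and has no
* covariate values, so drop it.
bysort id (age): drop if _n == 1

* Multiple-record survival data with age as the time scale
stset age, id(id) time0(age0) failure(died==1)
list id age0 age health died
```

After the conversion, the row that ends in death at 65.1 carries the health value measured at 64.5, which is the "last measurement holds until death" assumption.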
Sascha Vum

Join Date: Jul 2018

Posts: 4
#5

24 Jul 2018, 10:02

Hi Clyde, I've been struggling the last couple of days to understand everything, but I'm afraid I still don't get it. I think I'm doing it all wrong. I now have the following file:

id   education   health status   event   age    wave
1    1           1               0       55.3   1
1    1           2               0       58.5   2
1    1           .               1       59.1   3
1    1           .               .       .      4
2    0           3               0       59.8   1
2    0           3               0       63.1   2
2    0           4               0       66.0   3
2    0           3               0       69.2   4

Person 1 enters the study aged 55.3 and dies at 59.1. He has two measurements (waves 1 and 2) in which the covariate (health status) was measured, and he died between waves 2 and 3. Health status is assumed to stay the same from the last measurement until death. Education is constant over time.
Person 2 enters the study aged 59.8 and stays in the study until the last wave (wave 4) without dying.
Is my data set up the right way to use snapspan? I've read the manual, but I don't really understand why the data has to be altered.
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#6

24 Jul 2018, 11:51

OK, I didn't see that you want to use time-varying covariates. In that case, your data is almost correctly set up. If you want health status to be assumed unchanged after it was last measured, you have to fill in the missing values by carrying the last observed value forward. If you leave them missing, those observations will be excluded from the analysis, which will badly distort the results here, because the record containing the death would be among those dropped. Once you take care of that, you should be OK to -stset- this as multiple-observations-per-person data and proceed.
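A minimal sketch of the carry-forward step, using the rows from post #5. The short names health and entry_age are assumptions for illustration, and the -enter()- option is one way (assumed here, not stated above) to tell -stset- that each person comes under observation at the first recorded age.

```stata
clear
input id education health event age wave
1 1 1 0 55.3 1
1 1 2 0 58.5 2
1 1 . 1 59.1 3
1 1 . . .    4
2 0 3 0 59.8 1
2 0 3 0 63.1 2
2 0 4 0 66.0 3
2 0 3 0 69.2 4
end

* Rows with no age (waves after death or dropout) carry no information
drop if missing(age)

* Carry the last observed health status forward within each person
bysort id (age): replace health = health[_n-1] if missing(health)

* Age at the first record = age at study entry
by id: generate entry_age = age[1]

* Multiple records per person, age as the time scale, delayed entry
stset age, id(id) failure(event==1) enter(time entry_age)
```

Note that -stset- reads each record as the interval ending at age, so a record's covariate values describe the interval before that age. If each measurement should instead apply to the interval after it, that is the shift -snapspan- performs.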

In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
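For illustration only, a typical -dataex- call looks like the sketch below. It uses the auto dataset that ships with Stata rather than the poster's data, and the choice of variables and rows is arbitrary.

```stata
* Load a dataset that ships with Stata, purely for illustration
sysuse auto, clear

* Print the first five observations of three variables as a block of
* -input- commands that can be pasted into a forum post
dataex make price mpg in 1/5
```

The output can be copied from the Results window into a post as is.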
Sascha Vum

Join Date: Jul 2018

Posts: 4
#7

25 Jul 2018, 07:41

Thank you so much for your help!