Help with time-dependent variable in longitudinal data

Giorgia Sulis

Join Date: Feb 2019

Posts: 11
#1

07 Mar 2019, 18:27

Hi everyone,

I am working on panel data as shown below:

I want to generate a time-dependent covariate for the total duration of past use up to time t. To do so, I thought I could use the "egen" command to sum up the values of curr_use up to time t, by id. However, I only managed to get the total duration of use for each subject (see below), i.e. a fixed-in-time variable, which is not what I need:

by id, sort: egen tot_duration = total(curr_use)

Could you please help me with the code? Thanks a lot!
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

07 Mar 2019, 18:51

Perhaps the following will do what you need, where I assume "time" is the name of the variable with the time that you want to sum across.

Code:

by id (time), sort: generate tot_duration = sum(curr_use)
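The difference from the egen attempt is that egen's total() returns one grand total per group, while generate's sum() returns a running total in the current sort order. A minimal sketch on made-up toy data (the values and the names tot_all and tot_sofar are hypothetical, chosen only for illustration):

```stata
* Toy data (hypothetical values), one row per id and time
clear
input int(id time) byte curr_use
1 1 0
1 2 1
1 3 1
2 1 1
2 2 0
end

* egen total(): one grand total per id, repeated on every row
bysort id: egen tot_all = total(curr_use)

* generate sum(): running total within id, accumulated in time order
bysort id (time): generate tot_sofar = sum(curr_use)

list, noobs sepby(id)
```

The running total includes the current row, so it counts use up to and including time t. Putting time in parentheses sorts on it within id without making it part of the by-group.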

With that said, you will notice that your attempt to include a screenshot of your data didn't work. You might take a few minutes to review the Statalist FAQ linked to from the top of the page for great advice on the most effective ways to post here. Pay particular attention to #12 where you will learn that screenshots are probably the least useful way to show data. In the future, when showing data examples, please use the dataex command to do so. If you are running version 15.1 or a fully updated version 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run the help dataex command to read the simple instructions for using it. dataex includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

Last edited by William Lisowski; 07 Mar 2019, 18:57.

Giorgia Sulis

Join Date: Feb 2019
Posts: 11
#3

07 Mar 2019, 19:47

Thank you very much for the tips and for your help with the code! It worked!

What if I want to generate a variable for the cumulative dose in the last 3 days for each id, i.e. the sum of all previous doses from time (t-3) to time t?
Hopefully, you will now be able to visualize the following dataset example:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(id time) byte event int(fup start stop) float(dose curr_use dur_t cum_dose)
1 1 0 9 1 2   0 0 0    0
1 2 0 9 2 3 2.6 1 1  2.6
1 3 0 9 3 4 2.6 1 2  5.2
1 4 0 9 4 5 2.6 1 3  7.8
1 5 0 9 5 6 2.6 1 4 10.4
1 6 0 9 6 7 1.3 1 5 11.7
1 7 0 9 7 8 1.3 1 6   13
1 8 1 9 8 9 1.3 1 7 14.3
2 1 0 6 1 2 2.6 1 1  2.6
2 2 0 6 2 3 2.6 1 2  5.2
2 3 0 6 3 4   0 0 2  5.2
2 4 0 6 4 5   0 0 2  5.2
2 5 0 6 5 6   0 0 2  5.2
3 1 0 8 1 2   0 0 0    0
3 2 0 8 2 3   0 0 0    0
3 3 0 8 3 4 1.3 1 1  1.3
3 4 0 8 4 5 1.3 1 2  2.6
3 5 0 8 5 6 1.3 1 3  3.9
3 6 0 8 6 7 1.3 1 4  5.2
3 7 1 8 7 8 1.3 1 5  6.5
end


William Lisowski

Join Date: Dec 2014
Posts: 10150
#4

07 Mar 2019, 20:09

Two things are not clear. First, days t-3 to t constitute 4 days rather than "the last 3 days", so you may need to adjust the sample code below. Second, what to do about the first few days when you don't have 3 (or 4) days of data? With that said, here are two different techniques to choose from (and thank you for the example data posted with dataex).

Code:

. bysort id (time): generate c3doseA = cum_dose - cum_dose[_n-3]
(9 missing values generated)

. bysort id (time): generate c3doseB = cum_dose - cond(_n>3,cum_dose[_n-3],0)

. format dose cum_dose c3doseA c3doseB %9.1f

. list id time dose cum_dose c3doseA c3doseB, noobs sepby(id)

  +-------------------------------------------------+
  | id   time   dose   cum_dose   c3doseA   c3doseB |
  |-------------------------------------------------|
  |  1      1    0.0        0.0         .       0.0 |
  |  1      2    2.6        2.6         .       2.6 |
  |  1      3    2.6        5.2         .       5.2 |
  |  1      4    2.6        7.8       7.8       7.8 |
  |  1      5    2.6       10.4       7.8       7.8 |
  |  1      6    1.3       11.7       6.5       6.5 |
  |  1      7    1.3       13.0       5.2       5.2 |
  |  1      8    1.3       14.3       3.9       3.9 |
  |-------------------------------------------------|
  |  2      1    2.6        2.6         .       2.6 |
  |  2      2    2.6        5.2         .       5.2 |
  |  2      3    0.0        5.2         .       5.2 |
  |  2      4    0.0        5.2       2.6       2.6 |
  |  2      5    0.0        5.2       0.0       0.0 |
  |-------------------------------------------------|
  |  3      1    0.0        0.0         .       0.0 |
  |  3      2    0.0        0.0         .       0.0 |
  |  3      3    1.3        1.3         .       1.3 |
  |  3      4    1.3        2.6       2.6       2.6 |
  |  3      5    1.3        3.9       3.9       3.9 |
  |  3      6    1.3        5.2       3.9       3.9 |
  |  3      7    1.3        6.5       3.9       3.9 |
  +-------------------------------------------------+
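If the window you want really is t-3 through t (four observations), the same subtraction should work with the lag moved back one row. A minimal sketch on made-up toy data; the values and the names c4doseA and c4doseB are hypothetical and simply mirror the ones above:

```stata
* Toy data (hypothetical values); cum_dose is the running total of dose
clear
input int(id time) float dose
1 1 1
1 2 2
1 3 3
1 4 4
1 5 5
1 6 6
end
bysort id (time): generate cum_dose = sum(dose)

* Four-observation window (rows t-3 through t): subtract the running
* total from four rows back
bysort id (time): generate c4doseA = cum_dose - cum_dose[_n-4]

* Same, but use the total so far while fewer than four earlier rows exist
bysort id (time): generate c4doseB = cum_dose - cond(_n>4, cum_dose[_n-4], 0)

list, noobs sepby(id)
```

Version A is missing on the fourth row even though four observations are available there, because cum_dose[_n-4] then refers to a nonexistent row 0; version B fills that row with the total so far.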

Last edited by William Lisowski; 07 Mar 2019, 20:13.


William Lisowski

Join Date: Dec 2014

Posts: 10150
#5

08 Mar 2019, 08:05

On reflection I should add one note to post #4. I have assumed that the values of the time variable are - essentially - just sequence numbers, and that there are no gaps. If instead they were an actual time, like "1 hour after inception, 2 hours after inception, ... " and if it were possible that an id might have gaps in the sequence of times, and if you were interested in the last 3 hours rather than the last 3 doses, we'd need to modify this code to take that into account. The code I wrote essentially ignores the actual values of the time variable, and just uses it for sorting purposes.
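For what it's worth, one way to make the window depend on the actual time values is to declare the panel structure and use time-series lag operators, which look up time t-1 and t-2 rather than the previous rows. This is only a sketch on made-up toy data with a deliberate gap; the values and the names lag1, lag2 and c3doseC are hypothetical, and treating a missing lag as a zero dose is an assumption about what a gap means in your data:

```stata
* Toy data (hypothetical values) with a gap: no record at time 3
clear
input int(id time) float dose
1 1 2.6
1 2 2.6
1 4 1.3
1 5 1.3
end

* Declare the panel so that L1. and L2. refer to time t-1 and t-2
xtset id time

* A lag that does not exist (start of follow-up or a gap) counts as zero;
* this is an assumption, not something the data can confirm
generate lag1 = cond(missing(L1.dose), 0, L1.dose)
generate lag2 = cond(missing(L2.dose), 0, L2.dose)

* Dose over times t-2 through t, based on time values rather than rows
generate c3doseC = dose + lag1 + lag2

list, noobs sepby(id)
```

At time 4 in this toy example the window covers times 2 through 4, so the dose at time 2 is counted and the absent time 3 adds nothing. The row-based code in post #4 would instead reach back to time 1.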
Giorgia Sulis

Join Date: Feb 2019

Posts: 11
#6

08 Mar 2019, 14:30

Thank you very much for your help!

I am aware that this variable is not applicable to the first few days, but I will think about it. Anyhow, your coding suggestion is really useful!

And yes, the values of the time variable have no gaps, so that is not a problem!

Thank you again!
Giorgia Sulis

Join Date: Feb 2019

Posts: 11
#7

11 Mar 2019, 18:26

Hello again!

I have another question concerning the analysis of the data reported in post #3. I want to fit a Cox model with dose as a time-dependent variable. Is it correct to do it as follows?

Code:

stset time, failure(event) id(id)
stcox dose, tvc(dose) texp(ln(time))

Thanks for your help!
William Lisowski

Join Date: Dec 2014

Posts: 10150
#8

11 Mar 2019, 18:31

I will have to yield to someone else, since survival analysis is not a strength of mine.