Calculating cumulative time at risk for a rolling window and varying number of distinct values

Jean Spinks

Join Date: Aug 2020

Posts: 4
#1

Calculating cumulative time at risk for a rolling window and varying number of distinct values

22 Aug 2022, 06:35

Dear Statalist,

I am working with a medication dataset, where a higher number of concurrent medications represents a greater risk of side effects. We are working to determine the best cut-off values for the number of concurrent meds as a screening tool.

The data is in long form, each observation represents one supply of a particular medication "drugname" (encoded to a numeric value) for an individual (identified by "id") on a particular date ("scriptdate") which has a certain amount of supply until a later date ("enddate"). We are interested in a particular drug group (denoted by a dummy "dbi" (not shown)).

I am using Stata 17.

Example data is shown below:

Code:

* Example generated by -dataex-. For more info, type help dataex clear input byte id str21 med int scriptdate long enddate byte ndistinct 1 "paracetamol + codeine" 21663 21693 2 1 "olanzepine" 21663 21693 2 2 "sertraline" 21812 21842 1 2 "amitrypyline" 22095 22125 1 2 "oxazepam" 22115 22145 1 2 "paracetamol + codeine" 22173 22280 1 2 "diazepam" 21552 21582 2 2 "risperidone" 21552 21612 2 3 "quetiapine" 21552 21612 2 3 "risperidone" 21552 21582 2 3 "tramadol" 21878 21908 3 3 "oxycodone" 21885 21915 2 3 "paracetamol + codeine" 21885 21915 2 4 "fluoxetine" 22292 22322 1 4 "quetiapine" 22323 22503 2 4 "buprenorphine" 22323 22503 2 5 "sertraline" 22250 22430 3 5 "paracetamol + codeine" 22250 58954 3 5 "gabapentin" 22328 22358 2 5 "mirtzapine" 22328 22508 2 end format %td scriptdate format %td enddate

I have used SSC rangerun to calculate the number of distinct values ("ndistinct", shown in dataset above) of concurrent medication using the following code:

program myprog
qui tab drugname
gen ndistinct = r(r)
end

rangerun myprog if dbi==1, use(drugname) inter(scripdate 0 enddate) by(id)

We now want to calculate the total time an individual is "at risk" as denoted by different values of ndistinct. The problem is that whilst rangerun currently identifies the maximum number of distinct values within a time interval, for example, 4 concurrent medications, it is not easy to determine for how long in that interval the individual was taking 4 medications, as opposed to 3 or 2. This is important for us to calculate as it provides additional information to decide the cut-off value of ndistinct.

I am struggling to find examples of how others have dealt with similar issues and would greatly appreciate any suggestions.

Many thanks, Jean.
Tags: None

Clyde Schechter

Join Date: Apr 2014
Posts: 30355

22 Aug 2022, 10:15

The following code will tell you the number of active prescriptions in place for each id in exhaustive ranges of dates.

Code:

assert scriptdate <= enddate
gen long obs_no = _n
reshape long @date, i(obs_no) j(script_end) string
by id (date), sort: gen active_rxs = sum(script_end == "script") - sum(script_end == "end")
collapse (max) active_rxs, by(id date)
by id (date): gen until = date[_n+1] - 1
format until %td
rename date from

Comment

Jean Spinks

Join Date: Aug 2020

Posts: 4
#3

22 Aug 2022, 23:22

Hi Clyde, thanks so much, what an elegant solution!!

Much appreciated,

Best, Jean.
Comment

Announcement

Calculating cumulative time at risk for a rolling window and varying number of distinct values

Comment

Comment