Calculating readmission rates

Yevgeniy Feyman

Join Date: Jul 2016

Posts: 32
#1

26 Jul 2016, 10:31

Hi fellow statalisters,

Hoping you can offer some assistance with this, as it's turned out to be more complicated than I initially thought.

I'm working with the HCUP State Inpatient Database for 2013. I'm trying to calculate 30-day readmission rates.

I have two primary variables:

visitlink: this is a unique patient identifier across the year
daystoevent: this is an encrypted date variable

This is the process for calculating readmission rate:

1. Identify "index" hospitalizations. These are identified as follows: 1) the first hospitalization for any patient 2) a hospitalization occurring more than 30 days after/before any other index hospitalization

2. Identify readmissions. These are hospitalizations occurring 30 days or less after an index hospitalization. Only the first such hospitalization after each index is counted as a readmission (this means that the readmission rate can be at most 100%).

Example, if hospitalizations happen on the following days:

Day 5: index
Day 10: readmission
Day 20: neither a readmission nor an index
Day 40: index
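
A minimal sketch of the rule described above, applied to the four example days for one hypothetical patient. It assumes no missing values in daystoevent and relies on the fact that Stata's replace works through observations in order, so a reference to the previous observation already sees its updated value. The name last_index is illustrative and not part of the HCUP data.

```stata
* Sketch only: the four example admissions for one hypothetical patient
clear
input visitlink daystoevent
1  5
1 10
1 20
1 40
end
sort visitlink daystoevent

* last_index (illustrative name): day of the most recent index hospitalization.
* The first stay starts the first window; a later stay starts a new window
* only if it is more than 30 days after the current window's index day.
by visitlink: gen last_index = daystoevent if _n == 1
by visitlink: replace last_index = ///
    cond(daystoevent - last_index[_n-1] > 30, daystoevent, last_index[_n-1]) if _n > 1

* an index hospitalization is the stay that opens a new window
by visitlink: gen byte index = _n == 1 | last_index != last_index[_n-1]

* only the first non-index stay after an index counts as a readmission
by visitlink: gen byte readmission = index == 0 & index[_n-1] == 1

list, sepby(visitlink) noobs
```

On these four rows, days 5 and 40 come out as index stays, day 10 as the readmission, and day 20 as neither.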

I don't think I've properly identified all index hospitalizations, and I'm having some trouble identifying readmissions in a systematic way.

The basic idea I've implemented so far is this:
1) Identify all first hospitalizations as index (this is easy, and done)
2) Identify any 2nd hospitalization occurring 30 days or fewer after the first index hospitalization as a readmission.
3) Identify any 2nd hospitalization occurring more than 30 days after the first index hospitalization as an index hospitalization.

But this isn't very systematic and doesn't generalize.

Hoping that someone here has ideas!

Thanks!
Yevgeniy Feyman

Join Date: Jul 2016

Posts: 32
#2

26 Jul 2016, 14:32

Any ideas?
Comment
Robert Picard

Join Date: Mar 2014

Posts: 1536
#3

26 Jul 2016, 15:05

You will increase your chances of getting a useful answer if you provide a small data example, preferably generated using dataex (from SSC). To install it, type in Stata's Command window:

Code:

ssc install dataex

You should then show the code you use to identify index hospitalizations. For the other cases that you do not quite know how to code, include the expected results so that those who would like to help can benchmark their solutions.
Comment
Yevgeniy Feyman

Join Date: Jul 2016

Posts: 32
#4

26 Jul 2016, 15:24

Got it! I've posted the code that I've been using below. Unfortunately, I can't install Stata programs, as I'm using the software from a remote server that's only connected to the intranet.

One note: in all the code I include an "if visitlink!=." qualifier because some 50k (out of 2.3 million) observations don't have a patient identifier.

EDIT: It seems to me that what this code misses is cases where a hospitalization is 30 days or less after a non-index hospitalization but more than 30 days after the previous index, and so should itself be an index. Ideally, for each value of "visitlink" I'd have Stata identify a "reference" hospitalization, calculate the time to the next one, the one after that, etc., until that difference is >30. Then label that observation as "index=1" and continue the process from there.

Unfortunately, I can't say with 100% certainty what the total counts should come out to, as the methods used to do this vary by organization, and they aren't always consistent.

Code:

*for each patient, generate # of days between each hospitalization*
sort visitlink daystoevent
by visitlink: gen admdiff=daystoevent-daystoevent[_n-1] if visitlink!=.

*for each patient, generate each hospitalization's count*
sort visitlink daystoevent
by visitlink: gen admcount=_n if visitlink!=.

*generate the index indicator variable, leave it missing for missing patient ids*
gen index=.
replace index=0 if visitlink!=.

*call any first hospitalization an index*
replace index=1 if admcount==1 & visitlink!=.

*if hospitalization is more than 30 days after previous, call it an index*
replace index=1 if admdiff>30 & visitlink!=.

*alternative approach to above: check if there's a hospitalization w/in 30 days + or -*
sort visitlink daystoevent
by visitlink: replace index=1 if daystoevent[_n+1]-daystoevent[_n]>30 | daystoevent[_n]-daystoevent[_n-1]>30 & visitlink!=.
sort visitlink daystoevent
by visitlink: replace index=0 if daystoevent[_n+1]-daystoevent[_n]<=30 | daystoevent[_n]-daystoevent[_n-1]<=30 & visitlink!=.

Last edited by Yevgeniy Feyman; 26 Jul 2016, 15:33.
Comment
Yevgeniy Feyman

Join Date: Jul 2016

Posts: 32
#5

26 Jul 2016, 15:43

Ahh, sorry, I misunderstood. Here's a small example with fake data (I can't post real data due to restrictions on use):

Code:

obsid visitlink daystoevent index readmission

1 1 1 1 0

2 1 15 0 1

3 1 34 1 0

4 1 35 0 1

5 1 60 0 0

6 2 5 1 0

7 2 23 0 1

8 2 30 0 0

9 2 59 1 0

So in this case observation 9, for instance, is an index. But because it's only 29 days away from observation 8, my code will identify it as a non-index hospitalization. Same for observation 5.
Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536
#6

27 Jul 2016, 08:54

You can still use user-written programs when Stata itself has no internet access. You can secure copies of the necessary files via the Statistical Software Components page at IDEAS using the following link:

Code:

https://ideas.repec.org/s/boc/bocode.html

If you do not have write access to Stata's system directories, you can place the downloaded files in the same directory where you place your do-files and change Stata's current directory to point to that directory. Stata should always be able to find programs located in the current directory. If you want a more general solution, type

Code:

help adopath

for information on how to add a directory to which you have read/write access to the ado-path.
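
As a rough illustration of the ado-path approach, the lines below add a folder to the search path for the current session and then check that a program copied there can be found. The folder name is purely hypothetical; substitute any directory you can write to. As far as I know, the change lasts only for the current Stata session.

```stata
* Sketch only: "C:/Users/me/ado" is a hypothetical folder you can write to
adopath + "C:/Users/me/ado"

* list the search path to confirm the folder was added
adopath

* check that Stata can find a program copied there, e.g. dataex.ado
* (this reports an error if the file is not on the path)
which dataex
```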

With respect to your question, the task of identifying index cases is a bit tricky if the 30-day window is relative to a previous index case. Here's a solution that iterates to find the next index case until no new case is found. Once all index cases are found, then the first readmission case is simply the observation that follows an index case (and is not itself an index case).

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(obsid visitlink daystoevent index readmission)
 1 1  1 1 0
 2 1 15 0 1
 3 1 34 1 0
 4 1 35 0 1
 5 1 60 0 0
 6 2  5 1 0
 7 2 23 0 1
 8 2 30 0 0
 9 2 59 1 0
10 2 60 0 1
11 2 90 1 0
end

* order events by date, check that data is fully sorted
isid visitlink daystoevent obsid, sort

* the first obs is always an index
by visitlink: gen idx = _n == 1

* look for the next index case, i.e. more than 30 days from previous case
local more 1
while `more' {

    // this is what we start from
    clonevar idx0 = idx
    
    // carry over daystoevent from previous index case(s)
    gen idx_days = daystoevent * idx
    by visitlink: replace idx_days = idx_days[_n-1] if idx_days == 0
    
    // number of days since last index case
    gen delta = daystoevent - idx_days
    
    // add a new index case
    by visitlink: replace idx = 1 if delta > 30 & delta[_n-1] <= 30
    
    // if we did not find a new case, we are done
    count if idx0 != idx
    local more = r(N)
    
    drop idx0 delta idx_days
    
}

* look for readmission cases, only flag the first after an index case
by visitlink: gen readmt = idx == 0 & idx[_n-1] == 1
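
Once the flags exist, the rate itself is the number of first readmissions divided by the number of index hospitalizations. The sketch below computes it from the hand-coded index and readmission columns of the example data so that it runs on its own. After running the loop above you would substitute idx and readmt. The local macro names are illustrative.

```stata
* Sketch only: the example data with its hand-coded flags
clear
input float(obsid visitlink daystoevent index readmission)
 1 1  1 1 0
 2 1 15 0 1
 3 1 34 1 0
 4 1 35 0 1
 5 1 60 0 0
 6 2  5 1 0
 7 2 23 0 1
 8 2 30 0 0
 9 2 59 1 0
10 2 60 0 1
11 2 90 1 0
end

* denominator: index hospitalizations; numerator: first readmissions
quietly count if index == 1
local n_index = r(N)
quietly count if readmission == 1
local n_readmit = r(N)

display "index stays: `n_index', readmissions: `n_readmit'"
display "30-day readmission rate (%): " %5.1f (100 * `n_readmit' / `n_index')
```

For the example data this gives 4 readmissions over 5 index stays, i.e. 80%.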

Comment

Yevgeniy Feyman

Join Date: Jul 2016

Posts: 32
#7

27 Jul 2016, 09:39

Thanks so much! This works perfectly.
Comment
marc georgi

Join Date: Feb 2017

Posts: 3
#8

09 Feb 2017, 10:08

Hi Robert,
I read with interest the solution you provided for calculating the readmission rate, as requested by Yevgeniy. I tried it myself and it works perfectly if you have only one large group of patients. However, I tried to limit the analysis to a subgroup of patients, which I called group 1. I added a variable that defines who is in group 1 (group1=1) and who is not (group1=0). I used this data:

input float(obsid visitlink daystoevent index readmission group1)
1 1 1 1 0 1
2 1 15 0 1 1
3 1 34 1 0 0
4 1 35 0 1 0
5 1 60 0 0 0
6 2 5 1 0 1
7 2 23 0 1 0
8 2 30 0 0 0
9 2 59 1 0 0
10 2 60 0 1 1
11 2 90 1 0 1
end

I modified the program in only one place, as follows:

by visitlink: gen idx = _n == 1 if group1==1

I am afraid I got the wrong answers: the index and readmission columns did not match the program-generated idx and readmt columns. Any ideas on what went wrong? Most importantly, any suggestions on how to fix it? Thanks a lot!
Comment

marc georgi

Join Date: Feb 2017
Posts: 3
#9

09 Feb 2017, 10:19

Oops, sorry for the format above, here is the data I used:

HTML Code:

input float(obsid visitlink daystoevent index readmission group1) 
 1 1 1  1 0 1
 2 1 15 0 1 1
 3 1 34 1 0 0
 4 1 35 0 1 0
 5 1 60 0 0 0
 6 2 5  1 0 1
 7 2 23 0 1 0
 8 2 30 0 0 0
 9 2 59 1 0 0
10 2 60 0 1 1
11 2 90 1 0 1
end

Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

#10

09 Feb 2017, 15:11

Well, if you want to limit the calculations to a subgroup of observations, the easiest way is to drop all observations that are not part of the subgroup and then merge back the results. It should look something like:

Code:

clear
input float(obsid visitlink daystoevent index readmission group1) 
 1 1 1  1 0 1
 2 1 15 0 1 1
 3 1 34 1 0 0
 4 1 35 0 1 0
 5 1 60 0 0 0
 6 2 5  1 0 1
 7 2 23 0 1 0
 8 2 30 0 0 0
 9 2 59 1 0 0
10 2 60 0 1 1
11 2 90 1 0 1
end

* order events by date, check that data is fully sorted
isid visitlink daystoevent obsid, sort

* save the master data and reduce to target subgroup
save "master_data.dta", replace
keep if group1

* ------- repeat as before ------------
* the first obs is always an index
by visitlink: gen idx = _n == 1

* look for the next index case, i.e. more than 30 days from previous case
local more 1
while `more' {

    // this is what we start from
    clonevar idx0 = idx
    
    // carry over daystoevent from previous index case(s)
    gen idx_days = daystoevent * idx
    by visitlink: replace idx_days = idx_days[_n-1] if idx_days == 0
    
    // number of days since last index case
    gen delta = daystoevent - idx_days
    
    // add a new index case
    by visitlink: replace idx = 1 if delta > 30 & delta[_n-1] <= 30
    
    // if we did not find a new case, we are done
    count if idx0 != idx
    local more = r(N)
    
    drop idx0 delta idx_days
    
}

* look for readmission cases, only flag the first after an index case
by visitlink: gen readmt = idx == 0 & idx[_n-1] == 1

* -------  and merge the results back with the master data
keep visitlink daystoevent obsid idx readmt
merge 1:1 visitlink daystoevent obsid using "master_data.dta", ///
    assert(match using) nogen

* reorder as before if desired
isid visitlink daystoevent obsid, sort
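
One plausible reason the "if group1==1" qualifier on the gen idx line misbehaves: it leaves idx missing for every observation outside the subgroup, so idx_days and delta are missing there too, and Stata treats a numeric missing value as larger than any number. A missing delta therefore satisfies delta > 30. The toy lines below illustrate just that comparison. The variable names x, big and big_safe are illustrative only.

```stata
* Sketch only: numeric missing (.) compares as larger than any number
clear
set obs 3
gen x = 10 in 1
replace x = 40 in 2
* x is left missing in observation 3

gen byte big = x > 30                       // flags obs 2 and the missing obs 3
gen byte big_safe = x > 30 & !missing(x)    // flags obs 2 only

list
```

Dropping the out-of-subgroup observations before running the loop, as in the code above, sidesteps the problem because no missing values are created in the first place.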

Comment

marc georgi

Join Date: Feb 2017

Posts: 3
#11

20 Feb 2017, 21:33

Thank you for the reply. I will try it on my data and see what happens!
Comment
Haroon Janjua

Join Date: Jul 2018

Posts: 1
#12

09 Jul 2018, 09:37

Hi Robert,

Can the above code for readmissions be applied to the HCUP SID databases? I am also trying to calculate 30-day readmissions for a project, but after pulling the data I did not get a reasonable number for the readmission variable: only 3 readmissions for liver patients across 3 years of data combined for 6 states.
Comment
