  • Advice for analyzing the nationwide readmissions database

    Key   NRD visit link   Time to event
    1     1                50
    2     1                60
    3     1                70
    4     1                80
    5     2                500
    6     2                510
    7     2                520

    Dear all,

    I recently started working with the Healthcare Cost and Utilization Project (HCUP) Nationwide Readmissions Database (NRD). This database tracks hospital readmissions for patients hospitalized in US hospitals.

    The data are structured as illustrated above. The "Key" variable uniquely identifies each hospitalization: the same patient hospitalized twice will have two different values of "Key". The "NRD visit link" variable identifies a unique patient across hospitalizations; in the data above, Key values 1, 2, 3 and 4 refer to a single patient admitted 4 times.

    The "Time to event" variable encodes the timing of each hospitalization: the difference between two values for the same patient is the number of days between those hospitalizations. To protect patient privacy, true dates are not provided. The time to event is consistent within one patient but not across different patients.

    For example, in the data above, we can say that the patient with an NRD visit link value of 1 was hospitalized at 0, 10, 20 and 30 days, but the absolute value of "Time to event" has no meaning.
    Similarly, we can tell that the patient with NRD visit link 2 was admitted at 0, 10 and 20 days, but the absolute numbers of "Time to event" do not matter.

    My question is, how can I generate a new variable that contains the difference between the first and last hospitalizations for each patient? I want to create a fourth column in the above data that contains the difference between each value of "Time to event" and the smallest value of "Time to event" for each patient as identified by "NRD visit link".

    Thank you so much!



  • #2
    In the future please give examples by using the -dataex- command (-ssc install dataex-, -help dataex- for instructions on its use) so that we have real variable names, information about data storage types, etc.

    So I will make informed guesses about what your variable names might be in the code below.

    My question is, how can I generate a new variable that contains the difference between the first and last hospitalizations for each patient? I want to create a fourth column in the above data that contains the difference between each value of "Time to event" and the smallest value of "Time to event" for each patient as identified by "NRD visit link".
    These are two different things, so the code below creates two different variables.

    Code:
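    * Days from each patient's first to last hospitalization (constant within patient)
    by NRD_visit_link (time_to_event), sort: gen elapsed_first_to_last = time_to_event[_N] - time_to_event[1]
    * Days from each patient's first hospitalization to the current one
    by NRD_visit_link (time_to_event): gen days_first_to_current_hosp = time_to_event - time_to_event[1]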
    Do read section 27.2 of the User's Guide [U] in the online manuals to familiarize yourself with the use of -by- and the behavior of the subscripts [1], _n (not used here), and _N in commands with -by:- prefixes generally.
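
    To see how the subscripts behave under -by:-, here is a minimal sketch that rebuilds the seven example rows with -input- and applies the same logic. The variable names Key, NRD_visit_link and time_to_event are guesses, not the actual names in the NRD files, and visit_number is a hypothetical extra variable added only to illustrate _n.

```stata
* Rebuild the example data (variable names are assumed, not the real NRD names)
clear
input Key NRD_visit_link time_to_event
1 1 50
2 1 60
3 1 70
4 1 80
5 2 500
6 2 510
7 2 520
end

* Within each patient, sort by time_to_event so that [1] is the earliest
* stay and [_N] is the latest
bysort NRD_visit_link (time_to_event): gen elapsed_first_to_last = time_to_event[_N] - time_to_event[1]
by NRD_visit_link (time_to_event): gen days_first_to_current_hosp = time_to_event - time_to_event[1]

* _n is the position of the current observation within its by-group
by NRD_visit_link (time_to_event): gen visit_number = _n

list, sepby(NRD_visit_link) noobs
```

    With these rows, elapsed_first_to_last should be 30 for every record of patient 1 and 20 for every record of patient 2, while days_first_to_current_hosp runs 0, 10, 20, 30 and 0, 10, 20 respectively.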
