Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Trying to combine two files that are linked by a common ID#

    Hi all,

    I'm new to Stata and would appreciate any direction here! I'm currently working with two files. One is a master list of Medicaid recipient cohort, and the second one is a list of prescription claims for some of those people in the Master list. Each beneficiary in the master file may have 0 up to hundreds of prescriptions in the second file. They are deidentified but both files contain a linked ID variable.

    I was wondering if anyone had advice on how I could create the Rx measures at the person level, then add them as new variables to the master file.

  • #2
    Dear Meghan,

    You can use merge function to combine two datasets.

    another option is using joinby

    cheers

    Comment


    • #3
      Hi Meghan,

      So I will start with the dataset that has each patient and their medications:
      Code:
      * I created some toy data with the most common medications
      dataex id medication  //  Data shared using -dataex-. To install: ssc install dataex
      clear
      input byte id str14 medication
      1 "amiodarone"    
      1 "atorvastatin"  
      1 "carvedilol"    
      1 "clopidogrel"  
      1 "lisinopril"    
      2 "levothyroxine"
      2 "prednisone"    
      2 "metformin"    
      2 "Insulin Lispro"
      3 "rosuvastatin"  
      3 "Warfarin"      
      3 "Nystatin"      
      4 "Celecoxib"    
      4 "Docusate"      
      end
      
      bysort id (medication): gen count = _N
      list, sepby(id) noobs
      
        +-----------------------------+
        | id       medication   count |
        |-----------------------------|
        |  1       amiodarone       5 |
        |  1     atorvastatin       5 |
        |  1       carvedilol       5 |
        |  1      clopidogrel       5 |
        |  1       lisinopril       5 |
        |-----------------------------|
        |  2   Insulin Lispro       4 |
        |  2    levothyroxine       4 |
        |  2        metformin       4 |
        |  2       prednisone       4 |
        |-----------------------------|
        |  3         Nystatin       3 |
        |  3         Warfarin       3 |
        |  3     rosuvastatin       3 |
        |-----------------------------|
        |  4        Celecoxib       2 |
        |  4         Docusate       2 |
        +-----------------------------+
      
      * Collapsing down to 1 line per patient (with a count of their medications)
      * Be sure to save your data before this point because collapse deletes data!
      collapse (count) count_meds = count, by(id)
      list, noobs abbrev(12)
      
        +-----------------+
        | id   count_meds |
        |-----------------|
        |  1            5 |
        |  2            4 |
        |  3            3 |
        |  4            2 |
        +-----------------+
      
      * Save this new collapsed dataset
      * This can then be merged using a 1:1 merge into your master dataset with patient characteristics

      Comment

      Working...
      X