  • Identifying most common combinations of values within another variable

    I have a slightly unusual problem, and I can't work out how to do it in Stata, or whether it is possible at all.

    I have a dataset with two variables. The first, "person", contains an integer that identifies a specific individual. The second, "drug", contains a code for each medicine that the individual has been prescribed. For example:
    Person Drug
    1 Aspirin
    1 Statin
    1 betablocker
    2 Statin
    2 betablocker
    3 antidepressant
    I want to identify which drugs tend to be prescribed together (i.e. co-occur within 'person'). In this example, two of the people have both Statin and betablocker.

    My test dataset has 110000 observations with 13700 persons and 4500 different drugs, but the final dataset is much larger.

    Any thoughts much appreciated.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte person str14 drug
    1 "Aspirin"       
    1 "Statin"        
    1 "betablocker"   
    2 "Statin"        
    2 "betablocker"   
    3 "antidepressant"
    end
    
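    * Save a copy of the data with drug renamed to drug2
    preserve
    rename drug drug2
    tempfile 2
    save `2'
    restore, preserve
    * Pair every drug with every other drug held by the same person
    joinby person using `2'
    * Keep each unordered pair once and drop a drug paired with itself
    bys person: drop if drug>=drug2
    * Count each pair at most once per person, then count persons per pair
    contract person drug drug2, freq(freq)
    contract drug drug2
    * Most frequent pairs first
    gsort -_freq
    list, sep(0)
    * Uncomment the next line to return to the original data
    *restore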
    Result:

    Code:
    . list, sep(0)
    
         +-------------------------------+
         |    drug         drug2   _freq |
         |-------------------------------|
      1. |  Statin   betablocker       2 |
      2. | Aspirin        Statin       1 |
      3. | Aspirin   betablocker       1 |
         +-------------------------------+
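
Because the final dataset is described as much larger than the test data, it may be worth gauging how many rows -joinby- will create before running it. The following is only a sketch under stated assumptions: it counts rows per person as they stand (no deduplication), and the variable names n_rx, n_joined and n_pairs are hypothetical names chosen for illustration.

```stata
* Sketch only: estimate the size of the -joinby- result before running it.
clear
input byte person str14 drug
1 "Aspirin"
1 "Statin"
1 "betablocker"
2 "Statin"
2 "betablocker"
3 "antidepressant"
end

preserve
* n_rx (hypothetical name): number of rows held by each person
bysort person: gen long n_rx = _N
* Keep one row per person
by person: keep if _n == 1
* Joining a person's n_rx rows to themselves yields n_rx^2 rows
gen double n_joined = n_rx^2
* If drugs are distinct within person, n_rx*(n_rx-1)/2 rows survive
* the later "drop if drug>=drug2"
gen double n_pairs = n_rx*(n_rx - 1)/2
quietly summarize n_joined
display "rows after joinby: " %15.0fc r(sum)
quietly summarize n_pairs
display "rows after drop:   " %15.0fc r(sum)
restore
```

For the six-row example this reports 9 + 4 + 1 = 14 joined rows and 3 + 1 + 0 = 4 retained pairs, matching what the code above produces before the final -contract-. If the first total is too large for memory, one option is to process persons in batches and append the contracted results.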



    • #3
      Many thanks - that takes me a huge leap forward. I'm thinking of using NodeXL to cluster these pairwise combinations. Is that something Stata can also do?



      • #4
        I am not knowledgeable in network analysis, but you can have a look at

        Code:
        search nwcommands
        and see whether this suite of commands does what you want. Otherwise, start a new thread with an informative title and explain what you want to achieve. Those who are knowledgeable in this area may be able to help.
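
If the clustering is done outside Stata, as #3 suggests, the pair counts can be written out as a weighted edge list. This is a minimal sketch, not a tested NodeXL workflow: the file name drug_pairs.csv, the column names source, target and weight, and the threshold of 2 are all assumptions made for illustration, and the import format the network tool expects should be checked separately.

```stata
* Sketch only: export the pair counts from #2 as a weighted edge list.
clear
input str14 drug str14 drug2 long _freq
"Statin"  "betablocker" 2
"Aspirin" "Statin"      1
"Aspirin" "betablocker" 1
end

* Optional: keep pairs shared by at least 2 persons (arbitrary threshold)
keep if _freq >= 2

* Hypothetical column names for an edge list
rename (drug drug2 _freq) (source target weight)

* Hypothetical file name; written to the current working directory
export delimited using "drug_pairs.csv", replace
```

In practice the -input- block would be replaced by the data left in memory after the final -contract- in #2.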
