I have a slightly unusual problem and I can't think how to do it in Stata, or whether it is possible at all.
I have a dataset with two variables. The first, "person", contains an integer that identifies a specific individual. The second, "drug", contains a code for each medicine that the individual has been prescribed. An example is shown in the table below.
I want to identify which drugs tend to be prescribed together (i.e. occur within the same 'person'). In the example, two of the people (persons 1 and 2) have both Statin and betablocker.
My test dataset has 110,000 observations covering 13,700 persons and 4,500 different drugs, but the final dataset is much larger.
Any thoughts much appreciated.
Example data:
Person | Drug
1      | Aspirin
1      | Statin
1      | betablocker
2      | Statin
2      | betablocker
3      | antidepressant
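For what it's worth, one possible approach is a self-join within person. The following is only a sketch under stated assumptions, not a tested solution: it assumes the variables are named person and drug and that the data in memory have one row per prescription. The names drug2, n_persons and the tempfile drugs are made up for illustration. It uses joinby to form every pair of drugs within each person, then contract to count how many persons have each pair.

```stata
* Sketch: count how many persons have each pair of drugs.
* Assumes variables person and drug; drug2 and n_persons are illustrative names.

* Keep one row per person-drug combination so repeat prescriptions are not double counted
duplicates drop person drug, force

* Save a copy of the data with drug renamed, to join against
tempfile drugs
preserve
rename drug drug2
save `drugs'
restore

* Form all pairwise combinations of drugs within each person
joinby person using `drugs'

* Keep each unordered pair once and drop self-pairs
keep if drug < drug2

* Collapse to one row per pair, with the number of persons who have both drugs
contract drug drug2, freq(n_persons)

* Show the most frequent pairs first
gsort -n_persons
list in 1/`=min(20, _N)', noobs
```

On the example data this should give a count of 2 for the Statin/betablocker pair and 1 for the two pairs involving Aspirin. The joined dataset grows roughly with the square of the number of drugs per person, so on the much larger final dataset it might need to be run in chunks of persons.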