
  • proportions

    Dear all,


    I would like to find out:
    1. The proportion of each drug class, and then of each drug, over the entire period and by year.
    2. The proportion of each drug class by broad age group.

    The data look like this:

    ID    Antiplatelet drugs              Lipid lowering agents  Supply date
          Aspirin  Clopidogrel  Heparin
    0001  1        0            1         0                      May 2002
    0001  0        0            0         1                      March 2008
    0001  0        0            1         0                      June 2016
    0001  0        0            0         0                      Nov 2013
    0001  1        1            0         0                      Sep 2003
    0002  0        1            0         0                      July 2007
    0002  0        1            0         1                      March 2006
    0002  0        0            0         1                      Jan 2009
    0002  1        0            0         0                      Feb 2010
    0002  1        1            0         0                      June 2011

    I would really appreciate it if you could help me solve this problem.

    Thank you in advance.

  • #2
    Your data display is unusable for developing and testing code. In the future please use -dataex- as requested in FAQ #12 to show example data that will be importable to Stata and will faithfully reproduce your Stata data set.

    Your question is unclear in some respects. When calculating proportions by year, how should people who do not appear in the data at all for a given year be treated? For example, ID 0002 has no entries in the year 2008. Should ID 0002 be treated as not taking any of these drugs in 2008, or should ID 0002 simply not be counted either way for that year?

    Below is some code that will work assuming the answer to the last question is that ID 0002 should not be counted either way. Since the display you show is not from an actual Stata data set (it can't be--spaces aren't allowed in variable names), I have made assumptions about what your data look like. If those are wrong, the code will not work, but if the above layout is reasonably suggestive, you should be able to modify the code to make it work with what you have.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(id aspirin clopidogrel heparin lipid_lowering) str6 supply_date
    1 1 0 1 0 "May-02"
    1 0 0 0 1 "Mar-08"
    1 0 0 1 0 "Jun-16"
    1 0 0 0 0 "Nov-13"
    1 1 1 0 0 "Sep-03"
    2 0 1 0 0 "Jul-07"
    2 0 1 0 1 "Mar-06"
    2 0 0 0 1 "Jan-09"
    2 1 0 0 0 "Feb-10"
    2 1 1 0 0 "Jun-11"
    end
    
    //    CREATE A STATA INTERNAL FORMAT MONTHLY DATE
    gen date = monthly(supply_date, "M20Y")
    format date %tm
    //    AND EXTRACT THE YEAR
    gen year = year(date)
    
    
    
    //    FLAG ONE OBSERVATION PER PERSON
    egen flag = tag(id)
    //    CREATE A VARIABLE FOR EVER TAKING A DRUG
    //    AND IDENTIFY PROPORTION OF SUCH PEOPLE
    foreach d of varlist aspirin-lipid_lowering {
        by id, sort: egen ever_`d' = max(`d')
        tab ever_`d' if flag
    }
    
    
    //    SAME IDEA PER PERSON PER YEAR
    //    FLAG ONE OBSERVATION PER PERSON PER YEAR
    drop flag ever_*
    egen flag = tag(id year)
    //    IDENTIFY PEOPLE WHO TAKE A DRUG ANY TIME IN A YEAR
    foreach d of varlist aspirin-lipid_lowering {
        by id year, sort: egen ever_`d' = max(`d')
        tab ever_`d' year if flag
    }
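
    The code above leaves out person-years with no records. If such people should instead be counted as not taking any drug in that year, one possible approach is to reduce the data to one row per person per year and then add the missing combinations with -fillin-. This is only a minimal sketch on a small made-up data set with a single drug; the values are invented for illustration. Note that -fillin- only creates years that appear somewhere in the data.

    ```stata
    * Made-up illustration data: one drug, year already extracted
    clear
    input byte(id aspirin) int year
    1 1 2002
    1 0 2008
    2 1 2007
    2 0 2009
    end

    //    ONE ROW PER PERSON PER YEAR: 1 IF THE DRUG WAS EVER SUPPLIED THAT YEAR
    collapse (max) aspirin, by(id year)

    //    ADD THE MISSING id-year COMBINATIONS AND TREAT THEM AS "NOT TAKING"
    fillin id year
    replace aspirin = 0 if _fillin

    //    2 PEOPLE x 4 OBSERVED YEARS
    assert _N == 8

    //    COLUMN PERCENTAGES GIVE THE PROPORTION TAKING THE DRUG IN EACH YEAR
    tab aspirin year, column
    ```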
    As for your second question, you have provided no information about age groups, so the question cannot be answered.
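
    Purely as an illustration of what that could look like: if the data did include each person's age, the same tag-and-max idea would carry over. Everything about age below is an assumption, namely the variable name age, the choice to treat age as constant within person, and the cut points 40 and 65. The data are made up.

    ```stata
    * Made-up illustration data with a HYPOTHETICAL age variable
    clear
    input byte(id aspirin) int age
    1 1 34
    1 0 34
    2 0 71
    2 1 71
    3 0 52
    end

    //    BROAD AGE GROUPS; THE CUT POINTS ARE ARBITRARY ASSUMPTIONS
    //    EACH GROUP IS LABELLED BY ITS LOWER BOUND (0, 40, 65)
    egen agegrp = cut(age), at(0 40 65 120)

    //    FLAG ONE OBSERVATION PER PERSON AND IDENTIFY EVER-USERS
    egen flag = tag(id)
    by id, sort: egen ever_aspirin = max(aspirin)

    //    PROPORTION OF EVER-USERS WITHIN EACH AGE GROUP
    tab ever_aspirin agegrp if flag, column
    ```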



    • #3
      In Clyde's very helpful code sketch, year(date) should, I think, be year(dofm(date)).
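
      The reason is that year() expects a daily date, whereas a %tm variable counts months since 1960m1. A small sketch of the difference, using two of the supply dates from #2 (the variable name year_wrong is introduced here only for comparison):

      ```stata
      clear
      input str6 supply_date
      "May-02"
      "Mar-08"
      end

      //    MONTHLY DATE: 2002m5 IS STORED AS 508, 2008m3 AS 578
      gen date = monthly(supply_date, "M20Y")
      format date %tm

      //    year() READS 508 AS "508 DAYS AFTER 01jan1960", I.E. A DAY IN 1961
      gen year_wrong = year(date)

      //    dofm() FIRST CONVERTS THE MONTH TO THE DAILY DATE OF ITS FIRST DAY
      gen year = year(dofm(date))

      assert year == 2002 in 1
      assert year == 2008 in 2
      assert year_wrong == 1961

      list
      ```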



      • #4
        Nick is correct.



        • #5
          Thank you so much, Mr Schechter and Mr Cox. As suggested, I will use -dataex- in the future.
