Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate monthly percentages using long dataset

    Hello,

    I have a long dataset that includes many observations for each person. There is a date variable (date) that corresponds to whether or not the person (ID) transmitted a value. The other variables include the start date (implantdate) and the end date (today). I am interested in calculating a monthly percentage of transmissions across all participants (ie, the count of the "date" variables for each month starting with the "implantdate" and ending with "today"). Since the start date is different for each person, I would like to calculate this as "Month 1" Month 2" etc. I am including a dataex example below (although unfortunately it only has data for one patient- is there a way to include data for many different IDs within the dataex? Maybe that's a separate discussion post!)

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte id float(date implantdate today)
    1 21573 20130 21638
    1 21571 20130 21638
    1 21570 20130 21638
    1 21568 20130 21638
    1 21566 20130 21638
    1 21565 20130 21638
    1 21564 20130 21638
    1 21546 20130 21638
    1 21531 20130 21638
    1 21530 20130 21638
    1 21529 20130 21638
    1 21528 20130 21638
    1 21527 20130 21638
    1 21526 20130 21638
    1 21525 20130 21638
    1 21524 20130 21638
    1 21523 20130 21638
    1 21508 20130 21638
    1 21504 20130 21638
    1 21503 20130 21638
    1 21502 20130 21638
    1 21501 20130 21638
    1 21500 20130 21638
    1 21498 20130 21638
    1 21497 20130 21638
    1 21496 20130 21638
    1 21495 20130 21638
    1 21494 20130 21638
    1 21493 20130 21638
    1 21492 20130 21638
    1 21491 20130 21638
    1 21490 20130 21638
    1 21489 20130 21638
    1 21488 20130 21638
    1 21487 20130 21638
    1 21486 20130 21638
    1 21485 20130 21638
    1 21484 20130 21638
    1 21482 20130 21638
    1 21480 20130 21638
    1 21479 20130 21638
    1 21478 20130 21638
    1 21477 20130 21638
    1 21476 20130 21638
    1 21472 20130 21638
    1 21468 20130 21638
    1 21467 20130 21638
    1 21463 20130 21638
    1 21462 20130 21638
    1 21461 20130 21638
    1 21460 20130 21638
    1 21459 20130 21638
    1 21458 20130 21638
    1 21456 20130 21638
    1 21455 20130 21638
    1 21454 20130 21638
    1 21453 20130 21638
    1 21452 20130 21638
    1 21449 20130 21638
    1 21448 20130 21638
    1 21447 20130 21638
    1 21446 20130 21638
    1 21445 20130 21638
    1 21444 20130 21638
    1 21442 20130 21638
    1 21441 20130 21638
    1 21440 20130 21638
    1 21439 20130 21638
    1 21437 20130 21638
    1 21436 20130 21638
    1 21434 20130 21638
    1 21433 20130 21638
    1 21432 20130 21638
    1 21431 20130 21638
    1 21429 20130 21638
    1 21428 20130 21638
    1 21427 20130 21638
    1 21426 20130 21638
    1 21425 20130 21638
    1 21424 20130 21638
    1 21423 20130 21638
    1 21422 20130 21638
    1 21421 20130 21638
    1 21420 20130 21638
    1 21419 20130 21638
    1 21418 20130 21638
    1 21417 20130 21638
    1 21416 20130 21638
    1 21414 20130 21638
    1 21413 20130 21638
    1 21412 20130 21638
    1 21411 20130 21638
    1 21410 20130 21638
    1 21409 20130 21638
    1 21408 20130 21638
    1 21407 20130 21638
    1 21406 20130 21638
    1 21405 20130 21638
    1 21404 20130 21638
    1 21403 20130 21638
    end
    format %td date
    format %td implantdate

  • #2
    Sarah Haynes I'm afraid I can't follow what you want to compute. Could you create a small example of the output you need?

    Comment


    • #3
      Hi Hemanshu!

      Does this help?

      I am looking to fill in a simple table like this:
      Total transmission adherence across participants ((total sum of "date")/(total sum of days between "implantdate" and "today"
      Month 1
      Month 2
      Month 3
      Month 4
      Month 5
      Month N

      Comment


      • #4
        I'm still struggling to understand. What are "month 1", "month 2", etc? What exactly do you mean by "total sum of 'date'"?

        Comment


        • #5
          Sorry I'm not explaining well! Month 1 for each person would be the the month starting with the implant date. So for this person in the dataex, the implant date is 2/11/15 so Month 1 would be any transmissions occurring between 2/11/15 and 3/10/15. This is 28 days so 28 would be the denominator of interest. Between these dates, there were 28 transmissions (represented by the "date" variable) so that would be the numerator. And then I'm interested in the sum of the numerator and denominator across all participants. Does that make sense?

          Comment


          • #6
            Okay, here's some code that might help. I'm not sure that it really does the job, since the data example is somewhat inadequate, but here's a shot. The code assumes that for any ID, there is only a single implantdate.

            Code:
            format %td date implantdate today
            gen byte is_transmission_day = 1
            xtset id date
            tsfill
            replace is_transmission_day = 0 if missing(is_transmission_day)
            
            foreach var in implantdate today {
                bys id: egen _`var' = max(`var')
                replace `var' = _`var'
                drop _`var'    
            }
            
            gen transmission_month = datediff(implantdate,date,"m")
            bys id transmission_month: egen days_in_transmission_month = count(id)
            bys id transmission_month: egen transmission_days_in_month = total(is_transmission_day)
            
            duplicates drop id transmission_month, force
            collapse (sum) days_in_transmission_month transmission_days_in_month, by(transmission_month)
            
            gen adherence = transmission_days_in_month/days_in_transmission_month
            format %4.3f adherence
            Last edited by Hemanshu Kumar; 12 Sep 2022, 13:54.

            Comment


            • #7
              i

              Comment


              • #8
                Wow, thank you!! This looks exactly right. The datediff command is really useful here. I really appreciate all of the effort you put into this- it's really helpful!

                Comment

                Working...
                X