  • Counting and tagging duplicates in ascending order

    Hi,

    I'm auditing head imaging requests in my hospital. Some patients will have had multiple scans. I have the scan results in long format (date / ida / scan type (MRI / CT) / request string / result string) and I want to stratify my analysis by scan and also report how many individuals have had each number of scans.

    I have tagged the number of occurrences of each unique anonymised id (ida) by date using the code below to give me scan number (snum).

    bysort ida (date): gen snum = _n

    'tab snum' tells me how many individuals have had at least 1, 2 or 3 scans, but I don't know how many people have had exactly 1, 2 or 3 scans. I tried reshaping to wide but had to delete all the other variables, and it didn't really work. No doubt Stata 15 includes an elegant solution, but for the moment I'm stuck and, as ever, I'd be grateful for any help.

    Many thanks

    Ali

    Ophthalmology Registrar, QMC, Nottingham

  • #2
    How about:
    Code:
    egen countsnum=count(ida), by(ida)



    • #3
      If I understand this correctly then

      Code:
      bysort ida : gen nscans = _N
      gives the number of scans. But for each person scanned 2, 3, 4, ... times, that number is repeated that many times. This is why egen, tag() was introduced (in Stata 7, after being user-written).

      Code:
      egen tag = tag(ida)
      tags each patient just once after which

      Code:
      tab nscans if tag
      shows how many people had each number of scans.
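
      As a minimal sketch of the difference, assuming a made-up six-row dataset (three patients with 1, 2 and 3 scans, and date simplified to an integer rather than a real date), something like this should reproduce both tabulations:

      Code:
      clear
      input ida date
      1 1
      2 1
      2 2
      3 1
      3 2
      3 3
      end
      * running scan number within patient, in date order (as in #1)
      bysort ida (date): gen snum = _n
      * total scans per patient, repeated on every row for that patient
      by ida: gen nscans = _N
      * flag exactly one row per patient
      egen tag = tag(ida)
      * patients with at least k scans
      tab snum
      * patients with exactly k scans
      tab nscans if tag

      On this toy data, tab snum should report frequencies 3, 2, 1 (at least 1, 2, 3 scans), whereas tab nscans if tag should report 1, 1, 1 (exactly 1, 2, 3 scans).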



      • #4
        Fantastic! Both work.

        Code:
        egen countsnum=count(ida), by(ida)
        counts every instance of ida, so tabulating it gives the total number of scans (95k). To derive the number of patients (58k), I divide each frequency by countsnum: the frequency for 1 scan (40k) by 1 = 40k, for 2 scans (20k) by 2 = 10k, for 3 scans (10k) by 3 = 3.3k, and so on, all the way to the poor blighter who had 42 scans (42 / 42 = 1).

        Code:
        tab nscans if tag
        elegantly avoids the need for further calculation, showing how many people had each number of scans: 1 = 40k, 2 = 10k, 3 = 3.3k, etc.
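
        For anyone wanting to check the division route rather than do it by hand, here is a rough sketch on a made-up dataset (three patients with 1, 2 and 3 scans). The variable names rows and patients are my own for illustration and are not from the posts above:

        Code:
        clear
        input ida
        1
        2
        2
        3
        3
        3
        end
        * scans per patient, repeated on every row for that patient
        egen countsnum = count(ida), by(ida)
        preserve
        * collapse to one row per value of countsnum, with its row frequency
        contract countsnum, freq(rows)
        * each patient contributes countsnum rows, so divide to get patients
        gen patients = rows / countsnum
        list
        restore

        On this toy data, rows should be 1, 2, 3 and patients should be 1, 1, 1, which is what tab nscans if tag gives directly.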

        Thank you so much

        Ali
        Last edited by Ali Poostchi; 06 Jun 2017, 12:05.



        • #5
          Note that after sorting by ida, you can always use the following without tagging:
          Code:
          ta countsnum if ida!=ida[_n-1]
          or certain variations thereof.
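
          A small sketch of this and one possible variation, assuming made-up data with no missing ida. The comparison picks out the first row of each ida block; on the very first observation ida[_n-1] is ida[0], which Stata evaluates as missing, so that row is selected too. The variable name first is only for illustration:

          Code:
          clear
          input ida
          1
          2
          2
          3
          3
          3
          end
          egen countsnum = count(ida), by(ida)
          sort ida
          * first row of each ida block, found by comparing with the previous row
          tab countsnum if ida != ida[_n-1]
          * one variation: mark the first row of each block explicitly with by:
          by ida: gen byte first = _n == 1
          tab countsnum if first

          Both tabulations should show one patient at each of 1, 2 and 3 scans on this toy data.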
