
  • How to see the highest frequency in each hospital

    Hi,

    I'm looking for a command that lists only the highest frequency of duplicated IDs in each hospital. I am not interested in seeing every patient who shares that highest count; I just want the highest count itself, regardless of how many patients have it (if any).

    For example, if hospital A has several civil IDs with the same highest number of visits (e.g. 145), I only want that number, with no repetition. So if there are 10 hospitals, the list should have only 10 rows. I hope this is clear. I would share a sample of the data, but it is very sensitive, and creating a realistic dummy dataset would take time.

    Any thoughts are appreciated!

  • #2
    I have done this so far:

    Code:
    sort Hospital_Code CID
    by Hospital_Code CID : gen CID_count = _N
    by Hospital_Code : egen max_count = max(CID_count)
    keep if CID_count == max_count
    list Hospital_Code T_CID CID_count if max_count > 1
    But the list includes all the other observations that have the same highest number of visits, which I don't want.


    * I guess the answer is to group the data after that.
    Last edited by Bader Bin Adwan; 26 Sep 2023, 11:27. Reason: I found the answer.
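
    For readers with the same problem, here is a minimal sketch of one way to get a single row per hospital. It assumes the variable names Hospital_Code and CID from the code above; visits and tag are hypothetical names introduced only for illustration. Instead of keeping every observation that matches the maximum, it tags exactly one observation per hospital, so ties do not produce repeated rows.

    ```stata
    * number of records for each ID within each hospital
    bysort Hospital_Code CID : gen visits = _N

    * within each hospital, sort by visits and tag the last (largest) observation
    bysort Hospital_Code (visits) : gen tag = _n == _N

    * one row per hospital: the highest count only
    list Hospital_Code visits if tag, noobs
    ```

    Tagging rather than using keep leaves the dataset intact, so the other observations remain available afterwards.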



    • #3
      Here is an analogue of your problem.

      Code:
      . webuse nlswork, clear
      (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
      
      . bysort race age : gen freq = _N
      
      . bysort race (freq) : gen tag = _n == _N
      
      
      
      . list race age freq if tag, noobs
      
        +--------------------+
        |  race   age   freq |
        |--------------------|
        | White    24   1172 |
        | Black    25    450 |
        | Other    23     21 |
        +--------------------+
      groups from the Stata Journal can do this too; the output is less concise.


      Code:
       
      
      . bysort race: groups race age, order(high) select(1)
      
      ------------------------------------------------------------------------------------
      -> race = White
      
        +-------------------------------+
        |  race   age   Freq.   Percent |
        |-------------------------------|
        | White    24    1172      5.81 |
        +-------------------------------+
      
      ------------------------------------------------------------------------------------
      -> race = Black
      
        +-------------------------------+
        |  race   age   Freq.   Percent |
        |-------------------------------|
        | Black    25     450      5.60 |
        +-------------------------------+
      
      ------------------------------------------------------------------------------------
      -> race = Other
      
        +-------------------------------+
        |  race   age   Freq.   Percent |
        |-------------------------------|
        | Other    23      21      6.93 |
        +-------------------------------+
      In either case, the possibility of ties is neglected. I am going to assume that they are unlikely, and there are fixes anyway.
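
      One possible fix for ties, sketched on the same nlswork example: count how many distinct ages share the top frequency within each race, and break ties deterministically by adding age to the sort order. The names maxfreq, pairtag and nmodes are hypothetical and only for illustration; this is one way to handle ties, not necessarily the fix intended above.

      ```stata
      webuse nlswork, clear
      bysort race age : gen freq = _N

      * highest frequency within each race
      bysort race : egen maxfreq = max(freq)

      * tag one observation for each distinct (race, age) pair
      egen pairtag = tag(race age)

      * number of distinct ages that share the highest frequency in each race
      bysort race : egen nmodes = total(pairtag & freq == maxfreq)

      * among tied ages, the one that sorts last on age is the one tagged
      bysort race (freq age) : gen tag = _n == _N

      list race age freq nmodes if tag, noobs
      ```

      A value of nmodes above 1 flags a group where the listed age is only one of several with the same highest frequency.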



      • #4
        Thanks Nick. Is there a way to order the results from high to low? The results I got look like this:

        +----------------------------------------+
        | Hospital_code_~m          T_CID   freq |
        |----------------------------------------|
        |               A1   282051001351     41 |
        |               A2   288022100649    557 |
        |               M3   282042200367     29 |
        |               F1   309011401489     17 |
        |               A3   312082600431     46 |
        |----------------------------------------|
        |               J4   253030885858     29 |
        |              Ib1   264092245454     94 |
        |              Ch1    30904092585      6 |
        |               M4   290092002787     16 |
        |               P5    26608298789     13 |
        +----------------------------------------+

        As you can see, the results are not ordered from high to low!



        • #5
          Code:
          sort freq
          before you list.



          • #6
            Thanks again Nick, it works, but strangely it sorts from low to high.



            • #7
              Sorry, yes:

              Code:
              gsort -freq
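
              Putting the pieces of this thread together, a minimal end-to-end sketch on the nlswork example from #3 might look like this. sort orders ascending only, while gsort accepts a minus sign in front of a variable name for descending order. Adding race as a second key is an assumption made here so that groups with equal freq appear in a stable order.

              ```stata
              webuse nlswork, clear

              * frequency of each age within each race
              bysort race age : gen freq = _N

              * tag one observation per race carrying the highest frequency
              bysort race (freq) : gen tag = _n == _N

              * descending by freq; ascending by race among equal frequencies
              gsort -freq race

              list race age freq if tag, noobs
              ```

              The tag is created before gsort, because gsort changes the sort order that the bysort step relied on.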
