
  • Creating frequencies from list of 30,000 observations saved as strings

    I have a list of more than 30,000 observations that represent patient diagnoses, saved as strings. I would like to find the 10 most commonly occurring diagnoses, but I can't figure out how. The tabulate command reports "too many values", and summarize just gives me a "0". How can I go about this?

  • #2
    Let's assume your string variable is called diagnosis. You can do this:

    Code:
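    * Drop observations with no diagnosis recorded
    drop if missing(diagnosis)
    * Count the observations within each distinct diagnosis
    by diagnosis, sort: gen frequency = _N
    * Keep one observation per diagnosis
    by diagnosis: keep if _n == 1
    * Sort by descending frequency and show the top 10
    gsort -frequency
    list diagnosis frequency in 1/10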
    Added: Is King Solomon your real name? It is the custom in this community to use our real given and family names as our user name, in order to promote collegiality and professionalism. If this is not your real name, please click on Contact Us in the lower right corner of the page and message the forum administrator requesting that he change your username on the account.
    Last edited by Clyde Schechter; 06 Feb 2019, 15:53.



    • #3
      Clyde gives excellent advice. Here is another way to do it:

      Code:
      contract diagnosis if !missing(diagnosis) 
      gsort -_freq
      list diagnosis _freq in 1/10
      See also the groups command from the Stata Journal.

      https://www.statalist.org/forums/for...updated-on-ssc

      Here is a silly example:


      Code:
      . webuse nlswork, clear
      (National Longitudinal Survey.  Young Women 14-26 years of age in 1968)
      
      . groups age, select(10) order(high)
      
        +-------------------------------+
        | age   Freq.   Percent     %<= |
        |-------------------------------|
        |  24    1636      5.74    5.74 |
        |  23    1604      5.63   11.36 |
        |  25    1566      5.49   16.86 |
        |  22    1458      5.11   21.97 |
        |  26    1414      4.96   26.93 |
        |-------------------------------|
        |  21    1317      4.62   31.55 |
        |  27    1317      4.62   36.17 |
        |  35    1264      4.43   40.60 |
        |  29    1251      4.39   44.99 |
        |  28    1250      4.38   49.38 |
        +-------------------------------+
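
      One caveat: both contract and the keep step in #2 replace the dataset in memory with one observation per diagnosis. A minimal sketch, assuming the variable is called diagnosis as above and that you want to carry on working with the original data afterwards, wraps the counting in preserve and restore:

      ```stata
      * Snapshot the data in memory so it can be brought back afterwards
      preserve

      * Collapse to one observation per diagnosis; _freq holds the count
      contract diagnosis if !missing(diagnosis)

      * Sort by descending frequency and list the 10 most common
      gsort -_freq
      list diagnosis _freq in 1/10

      * Bring back the original observations
      restore
      ```

      The list command with in 1/10 assumes there are at least 10 distinct diagnoses, which seems safe with more than 30,000 observations.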
