  • Counting string variable

    Hello
    I have a variable called parent that holds the name of the parent firm. I want descriptive statistics on the number of entries per firm. So I need to count how many company names in the variable "parent" are listed 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 times, and how many fall in the intervals 10-20 and 20-30. However, I'm having trouble doing this. Any suggestions?
    Best regards, Frederik

  • #2
    What did you do and what exactly was the trouble? Please read section 12 of the FAQ (http://www.statalist.org/forums/help) regarding posting sample data using dataex.

    Without knowing more, collapse would be a potential candidate:

    Code:
    clear
    input str5 comp
    a
    a
    a
    b
    b
    c
    c
    d
    d
    d
    d
    e
    e
    e
    e
    end
    
    gen freq = 1
    
    collapse (sum) freq, by(comp)
    
    recode freq (1/3 = 1 "1-3 times") (4/6 = 2 "4-6 times"), gen(freq_cat)
    
    list
    Results:

    Code:
         +-------------------------+
         | comp   freq    freq_cat |
         |-------------------------|
      1. |    a      3   1-3 times |
      2. |    b      2   1-3 times |
      3. |    c      2   1-3 times |
      4. |    d      4   4-6 times |
      5. |    e      4   4-6 times |
         +-------------------------+
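
    A minimal sketch of how the same idea might be adapted to the original question, assuming the firm name is stored in a string variable called parent and that the intervals are meant as the non-overlapping bins 11-20 and 21-30. The toy data and the name freq_cat are made up for illustration; contract is used here in place of gen plus collapse, since it produces one row per firm with the count in _freq.

    ```stata
    * Toy data (hypothetical): firm a has 25 entries, b has 12, c has 2, d has 1
    clear
    set obs 40
    gen parent = cond(_n <= 25, "a", cond(_n <= 37, "b", cond(_n <= 39, "c", "d")))

    * One row per firm; _freq holds the number of entries for that firm
    contract parent

    * Counts 1-10 stay as they are; 11-20 and 21-30 are binned (assumed bins).
    * Counts above 30 are left unchanged by these rules.
    recode _freq (11/20 = 11 "11-20") (21/30 = 12 "21-30"), gen(freq_cat)

    * Number of firms at each count or bin
    tabulate freq_cat
    ```

    Note that contract, like collapse, replaces the data in memory, so it may be worth wrapping this in preserve and restore if the original observations are still needed.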


    • #3
      Not sure exactly what you're asking, but something like this might be a start. It counts the number of entries for each parent firm:

      bysort parent: gen parentN = _N

      If you want an indicator by values of that, then recode would work.

      g parentG = parentN

      recode parentG (11/20 = 20) (21/30 = 30)

      Code:
      clear
      set obs 30
      g parentN = _n
      g parentG = parentN
      recode parentG (11/20 = 20) (21/30 = 30)
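
      As a rough sketch of how this could be turned into the table of firms asked for in the first post without collapsing the data: the toy data and the name firstrow are hypothetical, and the bins 11-20 and 21-30 are an assumption about what the intervals were meant to be.

      ```stata
      * Toy data (hypothetical): firm a has 25 entries, b has 12, c has 2, d has 1
      clear
      set obs 40
      gen parent = cond(_n <= 25, "a", cond(_n <= 37, "b", cond(_n <= 39, "c", "d")))

      * Number of entries per firm, repeated on every row of that firm
      bysort parent: gen parentN = _N

      * Flag exactly one row per firm so each firm is counted once
      egen firstrow = tag(parent)

      * Same binning as above, stored in a new variable
      recode parentN (11/20 = 20) (21/30 = 30), gen(parentG)

      * Number of firms at each count or bin
      tabulate parentG if firstrow
      ```

      Without the if firstrow restriction, tabulate would count rows rather than firms, so a firm with 25 entries would be counted 25 times.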
