  • Determine number of unique values groupwise

    Hey community,

    I have grouped my observations by the variable companyid so that there are multiple directors working for ONE company. I now want to determine the top 100 companies with the largest director workforce, that is: I want to find those companies with the highest number of unique director IDs.

    Thanks a lot in advance!

    Marie

  • #2
    Code:
    // create some example data
    clear
    set obs 10
    gen company_id = _n
    expand 6
    gen director_id = ceil(runiform()*6)
    
    sort company_id director_id
    list, sepby(company_id)
    
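    // create number of distinct directors
    // step 1: flag the first observation of each company-director pair
    bysort company_id director_id : gen     n = (_n == 1)
    // step 2: running sum of the flags within each company
    by     company_id             : replace n = sum(n)
    // step 3: the last running sum is the distinct count; copy it to every row
    by     company_id             : replace n = n[_N]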
    
    list, sepby(company_id)
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
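
    The code above attaches the count n to every observation but stops before the ranking asked for in #1. A minimal sketch of one way to continue, assuming the data from the example are still in memory; the variable name one_per_company is made up for illustration:

    ```stata
    // continue from the example data, where n is already attached to every row
    egen byte one_per_company = tag(company_id)   // flag exactly one row per company
    gsort -one_per_company -n company_id          // flagged rows first, largest n on top
    // list at most 100 companies; min() avoids an error when the data have fewer rows
    list company_id n in 1/`=min(100, _N)' if one_per_company
    ```

    Ties in n are broken by company_id here, which is an arbitrary choice; the original data stay in memory, only the sort order changes.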



    • #3
      Another way, using the same variables company_id and director_id as in Maarten Buis's example. Note that contract replaces the data in memory with one row per company.

      Code:
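      // keep one row per company-director pair
      bysort company_id director_id : keep if _n == 1
      // reduce to one row per company; _freq = number of distinct directors
      contract company_id
      // largest counts first
      gsort -_freq
      // show up to 100 companies (fewer if the data hold fewer)
      list in 1/`=min(100, _N)'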



      • #4
        For more general discussion, including terminology, see https://www.stata-journal.com/sjpdf....iclenum=dm0042
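
        A compact idiom for the same count uses egen with tag() and total(). The sketch below is only an illustration: the toy data and the variable names tag_pair and n_distinct are assumptions, not part of the posts above.

        ```stata
        // toy data: company 1 has 2 distinct directors, company 2 has 3
        clear
        input company_id director_id
        1 10
        1 10
        1 11
        2 20
        2 21
        2 22
        end

        // flag one observation per company-director pair (1 or 0, never missing)
        egen byte tag_pair = tag(company_id director_id)
        // summing the flags within company gives the number of distinct directors
        egen n_distinct = total(tag_pair), by(company_id)

        list, sepby(company_id)
        ```

        Like the by: approach in #2, this leaves every original observation in place and repeats the count on each row of a company.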



        • #5
          Thank you very much!!
