
  • Grouping unique values of one variable by a second variable

    Hello, Stata Community,

    I am having trouble getting Stata to do what I need. I have a dataset with a long list of company names and a variable that identifies unique compounds. The compound identifier is defined at the market level, so the same value may appear several times for one company and again for other companies. I want to generate a new variable that numbers the distinct compounds within each company, that is, a company-level compound identifier built by grouping the market-level identifiers by company.

    For example, my data look like this:
    Code:
    input byte unique str20 company
    37 "Aurobindo Pharma"
    42 "Aurobindo Pharma"
    42 "Aurobindo Pharma"
    20 "Avanthi Inc"     
    20 "Avema Pharma"    
     4 "Barr"            
     4 "Barr"            
     4 "Barr"            
     4 "Barr"            
     4 "Barr"            
    35 "Barr"            
    end

    And what I need is:
    Code:
    input byte unique str20 company byte cunique
    37 "Aurobindo Pharma" 1
    42 "Aurobindo Pharma" 2
    42 "Aurobindo Pharma" 2
    20 "Avanthi Inc" 1    
    20 "Avema Pharma" 1   
     4 "Barr" 1           
     4 "Barr" 1           
     4 "Barr" 1           
     4 "Barr" 1           
     4 "Barr" 1           
    35 "Barr" 2           
    end
    If I could combine the by prefix with egen's group() function, it would be exactly what I need. Does anyone know a way around this?

    Thanks!
    Last edited by Jared Thorpe; 22 Sep 2017, 15:48.

  • #2
    Code:
    by company (unique), sort: gen cunique = sum(unique != unique[_n-1])
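
    A sketch of why this one-liner works, run on the example data from the question. Within each company the data are sorted by unique, the comparison unique != unique[_n-1] evaluates to 1 whenever the value changes (including the first observation of each company, where unique[_n-1] is missing), and sum() keeps a running total of those changes. The second half is only a cross-check built on egen's group() function; the names g and cunique2 are introduced here for illustration and are not part of the answer above.

```stata
clear
input byte unique str20 company
37 "Aurobindo Pharma"
42 "Aurobindo Pharma"
42 "Aurobindo Pharma"
20 "Avanthi Inc"
20 "Avema Pharma"
 4 "Barr"
 4 "Barr"
 4 "Barr"
 4 "Barr"
 4 "Barr"
35 "Barr"
end

* Sort by company, then by unique within company.
* Under by:, _n restarts at 1 in each company, so unique[_n-1] is missing
* for the first observation and the comparison is true (1) there.
* sum() is a running sum, so the counter steps up by 1 at each new value.
by company (unique), sort: gen cunique = sum(unique != unique[_n-1])

* Cross-check (illustrative): group() numbers the company-unique pairs in
* sorted order, so subtracting each company's first group number and
* adding 1 restarts the count at 1 within every company.
egen g = group(company unique)
by company (g), sort: gen cunique2 = g - g[1] + 1
assert cunique == cunique2

list company unique cunique, sepby(company)
```

    Note that sorting changes the order of the observations. If the original order matters, one option is to store it first with something like gen long obsno = _n and sort on that variable afterwards. The sketch also assumes unique has no missing values.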



    • #3
      Thank you!!!
