
  • Count unique numbers of a varlist

    Dear statalists,

    I have the following dataset and would like to count the number of unique analysts (by analyst code, with duplicates removed) for each firm in each year.
    firm id   analyst code  year
    00130H10  9494          1996
    00130H10  7422          1996
    00130H10  1.80E+04      1996
    00130H10  9946          1996
    00130H10  3.20E+04      1996
    00130H10  9946          1996
    00130H10  3.20E+04      1996
    00130H10  3.10E+04      1996
    00130H10  9932          1996
    00130H10  7422          1996
    00130H10  3.20E+04      1996
    00130H10  9932          1996
    00130H10  3.10E+04      1996
    00130H10  7422          1996
    00130H10  0             1997
    00130H10  1.80E+04      1997
    00130H10  9946          1997
    00130H10  9946          1997
    00130H10  1646          1997
    00130H10  556           1997
    00130H10  3.10E+04      1997
    00130H10  9494          1997
    00130H10  3.20E+04      1997
    00130H10  3.10E+04      1997
    00130H10  1112          1997
    00130H10  9494          1997
    00130H10  9946          1997
    00130H10  3.20E+04      1997
    00130H10  1112          1997
    Many thanks in advance.

    Best!!

  • #2
    There are plenty of ways to do this, and it all depends on what you want.
    If you just want the number to be displayed, use the user-written -distinct- command, available from SSC.

    If you want the number to be saved in a variable, use the egen function nvals() (from the user-written egenmore package, also on SSC).

    Anyway, everything you need is written here:
    http://www.stata.com/support/faqs/da...-observations/
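
    As a rough sketch of the options above, assuming the variables are named firmid, analystcode and year (these names are guesses, not taken from the actual dataset), something like the following should work. The last two lines use only official Stata and follow the tagging idea discussed in the FAQ.

    Code:
    * Variable names firmid, analystcode and year are assumptions
    * Display only: distinct (user-written, from SSC), one firm-year at a time
    ssc install distinct
    bysort firmid year: distinct analystcode

    * Save the count in a variable: nvals() from egenmore (user-written, from SSC)
    ssc install egenmore
    egen n_analysts = nvals(analystcode), by(firmid year)

    * Same count with official Stata only: tag one observation per
    * firm-year-analyst combination, then total the tags within firm-year
    egen byte first = tag(firmid year analystcode)
    egen n_analysts2 = total(first), by(firmid year)

    On the 29 observations posted, taking the codes exactly as displayed, this gives 7 distinct analysts for 1996 and 9 for 1997.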

    Charlie



    • #3
      Cong:
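      what follows can do the trick (I assume your variables are named firmid, analystcode and year; note that it drops the duplicate observations):
      Code:
      bysort firmid year analystcode: keep if _n == 1
      by firmid year: generate n_analysts = _N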
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Cong: You got good advice, except that your data appear messed up and it is probably unwise to do anything until that is reversed.

        Your analyst code variable looks like a string variable and has values like "1.80E+04". Here E means "times 10 to the power of", so 1.80E+04 is 1.80 x 10^4 rounded to three significant figures and could be any integer from 17950 to 18049, or so I guess, so distinct codes may well have got mushed together.
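
        A quick way to see how much of the variable is affected might look like this. It is only a sketch, and analystcode is an assumed variable name; the strpos() lines apply only if the variable really is a string.

        Code:
        * analystcode is an assumed variable name
        * Check the storage type first
        describe analystcode
        * If it is a string variable, count the values stored in scientific notation
        count if strpos(upper(analystcode), "E+") > 0
        * and see which ones they are
        tabulate analystcode if strpos(upper(analystcode), "E+") > 0

        If that count is not zero, the full codes cannot be recovered within Stata; the data would need to be re-exported from the original source with the codes written out in full.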

        On the program distinct, the Stata Journal references given below are clickable if you repeat the command in Stata:

        Code:
        .        search distinct, sj
        
        Search of official help files, FAQs, Examples, SJs, and STBs
        
        SJ-12-2 dm0042_1  . . . . . . . . . . . . . . . . Software update for distinct
                (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                Q2/12   SJ 12(2):352
                options added to restrict output to variables with a minimum
                or maximum of distinct values
        
        SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
                (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                Q4/08   SJ 8(4):557--568
                shows how to answer questions about distinct observations
                from first principles; provides a convenience command



        • #5
          Dear Charlie, Carlo and Nick, many thanks for your kind help.
