  • Count unique number of firms

    Dear Statalisters,

    My panel data are organized by firm, year, and industry. Below is a sample of my data.
    Question: how do I count the number of unique firms for which the dummy event_d == 1? The expected result is 3, for firms 1001, 1002, and 1006.

    input firm str2 industry year event event_d
    1001 A 1998 0 1
    1001 A 2000 0 1
    1001 A 1999 0 1
    1001 A 1997 1 1
    1002 A 1999 0 1
    1002 A 2000 0 1
    1002 A 1998 1 1
    1006 B 1997 0 1
    1006 B 1998 0 1
    1006 B 1999 1 1
    1008 C 1998 0 0
    1008 C 1997 0 0
    end



    Thanks so much!
    Rochelle

  • #2
    Unique, according to many dictionaries and style guides, means occurring once only. Here and elsewhere I commend the term "distinct".

    Code:
    . input firm str2 industry year event event_d
    
               firm   industry       year      event    event_d
      1. 1001 A 1998 0 1
      2. 1001 A 2000 0 1
      3. 1001 A 1999 0 1
      4. 1001 A 1997 1 1
      5. 1002 A 1999 0 1
      6. 1002 A 2000 0 1
      7. 1002 A 1998 1 1
      8. 1006 B 1997 0 1
      9. 1006 B 1998 0 1
     10. 1006 B 1999 1 1
     11. 1008 C 1998 0 0
     12. 1008 C 1997 0 0
     13. end
    
    . distinct firm if event_d
    
    -----------------------------
          |     total   distinct
    ------+----------------------
     firm |        10          3
    -----------------------------
    For distinct, see a Stata Journal article by Gary Longton and myself, and updates:

    FAQ . . . . . . . . . . . . . . . . . . . Number of distinct observations
    . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox and G. Longton
    10/08 How do I compute the number of distinct observations?
    http://www.stata.com/support/faqs/data-management/number-of-distinct-observations/

    FAQ . . . . . . . . . Counting distinct strings across a set of variables
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
    11/06 How do I count the number of distinct strings
    across a set of variables?
    http://www.stata.com/support/faqs/data-management/counting-distinct-strings/

    FAQ . . . . . . . . . . . . . . Calculating the number of distinct values
    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
    9/06 How do I calculate the number of distinct
    values seen so far?
    http://www.stata.com/support/faqs/data-management/calculating-number-of-distinct-values/

    SJ-12-2 dm0042_1 . . . . . . . . . . . . . . . . Software update for distinct
    (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
    Q2/12 SJ 12(2):352
    options added to restrict output to variables with a minimum
    or maximum of distinct values

    SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
    (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
    Q4/08 SJ 8(4):557--568
    shows how to answer questions about distinct observations
    from first principles; provides a convenience command

    Rochelle: You were pointed to this article in http://www.statalist.org/forums/foru...up-2-variables on 19 July.
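
For readers who do not have distinct installed, the same count can be obtained from first principles with official commands. The following is a minimal sketch, assuming the example data from the first post (re-entered here so the block runs on its own). The variable name firmtag is chosen for illustration and does not come from the thread. egen's tag() function marks one observation per distinct firm among those satisfying the if condition and sets all other observations to 0.

```stata
clear
input firm str2 industry year event event_d
1001 A 1998 0 1
1001 A 2000 0 1
1001 A 1999 0 1
1001 A 1997 1 1
1002 A 1999 0 1
1002 A 2000 0 1
1002 A 1998 1 1
1006 B 1997 0 1
1006 B 1998 0 1
1006 B 1999 1 1
1008 C 1998 0 0
1008 C 1997 0 0
end

* firmtag is a hypothetical name: 1 for exactly one observation
* per distinct firm with event_d == 1, and 0 everywhere else
egen byte firmtag = tag(firm) if event_d == 1

* the number of tagged observations is the number of distinct firms;
* for the example data this reports 3 (firms 1001, 1002, 1006)
count if firmtag
```

The count is also left behind in r(N), which is convenient if the number is needed later in a do-file.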
    Last edited by Nick Cox; 17 Aug 2015, 08:26.



    • #3
      Both Rochelle and Nick mean "distinct", and Nick has provided the right solution. But if anyone gets to this thread in the future looking for "unique" meaning "occurring only once", here is a solution for that:

      Code:
      clear all
      
      input firm str2 industry year event event_d
          1001 A 1998 0 1
          1001 A 2000 0 1
          1001 A 1999 0 1
          1001 A 1997 1 1
          1002 A 1999 0 1
          1002 A 2000 0 1
          1002 A 1998 1 1
          1006 B 1997 0 1
          1006 B 1998 0 1
          1006 B 1999 1 1
          1008 C 1998 0 0
          1009 C 1997 0 0
      end
      
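      * keep only observations whose combination of values in varlist occurs exactly once
      program define keepunique
          syntax varlist
          sort `varlist'
          tempvar c
          * within each by-group, _N is the number of observations in that group
          by `varlist': generate long `c' = _N
          keep if `c' == 1
      end
      
      * firms 1008 and 1009 each appear once in the data above, so only they remain
      keepunique firm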
      Best, Sergiy Radyakin

      PS: note that Rochelle has posted a properly coded data example, which made it easy to get started with coding the solution.
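
If the aim is only to count firms that occur once, without dropping any observations, a non-destructive variant may be more convenient. This is a sketch, assuming the data entered in this post (repeated here so the block runs on its own). The variable name once is hypothetical.

```stata
clear
input firm str2 industry year event event_d
1001 A 1998 0 1
1001 A 2000 0 1
1001 A 1999 0 1
1001 A 1997 1 1
1002 A 1999 0 1
1002 A 2000 0 1
1002 A 1998 1 1
1006 B 1997 0 1
1006 B 1998 0 1
1006 B 1999 1 1
1008 C 1998 0 0
1009 C 1997 0 0
end

* once is a hypothetical name: 1 if the firm has a single observation
bysort firm: generate byte once = _N == 1

* each such firm contributes exactly one observation, so this count
* is the number of firms occurring once; here it reports 2 (1008, 1009)
count if once
```

Unlike keepunique, this leaves the dataset intact, so the indicator can be used in later if conditions.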



      • #4
        Thanks Nick! Thanks Sergiy!

        @Nick, I installed distinct. I also followed your earlier articles and used

        by event_d firm, sort: gen nvals = _n == 1
        by event_d : replace nvals = sum(nvals)
        by event_d : replace nvals = nvals[_N]

        Both of them work!


        @Sergiy, I will keep your code in my back pocket for future reference. Thanks again!
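
The three by: lines in this post store, in every observation, the number of distinct firms within each value of event_d. The same result can probably be reached more compactly with egen. This is a sketch, assuming the example data from the first post (re-entered so the block runs on its own). The names pairtag and nfirms are hypothetical.

```stata
clear
input firm str2 industry year event event_d
1001 A 1998 0 1
1001 A 2000 0 1
1001 A 1999 0 1
1001 A 1997 1 1
1002 A 1999 0 1
1002 A 2000 0 1
1002 A 1998 1 1
1006 B 1997 0 1
1006 B 1998 0 1
1006 B 1999 1 1
1008 C 1998 0 0
1008 C 1997 0 0
end

* pairtag (hypothetical name) marks one observation per distinct
* combination of event_d and firm
egen byte pairtag = tag(event_d firm)

* summing the tags within event_d gives the number of distinct firms,
* repeated in every observation of that event_d group
egen nfirms = total(pairtag), by(event_d)

* one row per event_d value, showing its distinct-firm count
tabulate event_d, summarize(nfirms) means
```

For the example data, nfirms is 3 where event_d is 1 and 1 where event_d is 0, which matches what the by: approach produces in nvals.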
