
  • counting unique observations

    id sales profit year size_group
    a 36 9 1991 1
    a 48 17 1992 1
    a 25 7 1993 2
    b 65 18 1991 1
    b 30 8 1992 2
    b 45 20 1993 1
    Dear all
    I have the above panel dataset for demonstration purposes, and I would like to count the unique/distinct ids with size_group==1 during the year 1991. For example, in the year 1991 there are 2 unique ids (a, b) that come under size_group==1.
    I tried
    Code:
    count if size_group==1
    But it is not what I really want. What should be the command?
    Any help in this regard would be much appreciated.

  • #2
    Here is your data example done the dataex way. (In your case, a copy and paste works fine; for date variables or numeric variables with value labels, for example, that would not be the case.)


    Code:
    clear
    input str1 id byte(sales profit) int year byte size_group
    "a" 36  9 1991 1
    "a" 48 17 1992 1
    "a" 25  7 1993 2
    "b" 65 18 1991 1
    "b" 30  8 1992 2
    "b" 45 20 1993 1
    end
    Good news: there is a distinct command you can download and it answers your specific question. Where to get it and what to read are spelled out later.

    Code:
    . distinct id if year == 1991 & size_group == 1
    
    ---------------------------
        |     total   distinct
    ----+----------------------
     id |         2          2
    ---------------------------
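If installing a community-contributed command is not an option, the same count can be sketched with official commands only. This is a minimal sketch under stated assumptions, not how distinct itself works internally; the variable name tag_1991_1 is made up for illustration.

```stata
clear
input str1 id byte(sales profit) int year byte size_group
"a" 36  9 1991 1
"a" 48 17 1992 1
"a" 25  7 1993 2
"b" 65 18 1991 1
"b" 30  8 1992 2
"b" 45 20 1993 1
end

* tag one observation per distinct id within the subset of interest;
* egen's tag() gives 0, not missing, to observations excluded by if
* (tag_1991_1 is a hypothetical name chosen for this sketch)
egen byte tag_1991_1 = tag(id) if year == 1991 & size_group == 1

* the number of tagged observations is the number of distinct ids;
* count also leaves that number behind in r(N)
count if tag_1991_1 == 1
```

With the example data, count reports 2, matching the distinct column above.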
    Although people often ask questions like this, it then may turn out that they want the results to go in new variables. The code below has been possible in Stata since 1999 or perhaps even earlier but it took me part of that time to realise that it is a fairly natural, general and flexible approach.

    Code:
    . egen tag = tag(id year size_group)
    
    . egen wanted1 = total(tag), by(size_group)
    
    . tabdisp size_group, c(wanted1)
    
    ----------------------
    size_grou |
    p         |    wanted1
    ----------+-----------
            1 |          4
            2 |          2
    ----------------------
    
    . egen wanted2 = total(tag), by(size_group year)
    
    . tabdisp year size_group, c(wanted2)
    
    ----------------------
              | size_group
         year |    1     2
    ----------+-----------
         1991 |    2      
         1992 |    1     1
         1993 |    1     1
    ----------------------
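Note that tagging on id, year and size_group together means wanted1 counts distinct id-year pairs within each size_group, which is why it shows 4 and 2. If the aim is instead the number of distinct ids in each size_group regardless of year, one possible variation is to tag on id and size_group only. The names tag_id and n_ids are made up for this sketch.

```stata
clear
input str1 id byte(sales profit) int year byte size_group
"a" 36  9 1991 1
"a" 48 17 1992 1
"a" 25  7 1993 2
"b" 65 18 1991 1
"b" 30  8 1992 2
"b" 45 20 1993 1
end

* tag one observation per distinct (id, size_group) pair, ignoring year
* (tag_id and n_ids are hypothetical names chosen for this sketch)
egen byte tag_id = tag(id size_group)

* summing the tags within size_group gives the number of distinct ids there
egen n_ids = total(tag_id), by(size_group)

tabdisp size_group, c(n_ids)
```

With the example data, both ids appear in both size groups at some point, so n_ids is 2 in each group.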
    All this -- including not only the distinct command but a wider discussion of technique (and even terminology: executive summary is that distinct is a much better word than unique) -- is covered in

    Code:
    . search distinct, sj
    
    Search of official help files, FAQs, Examples, and Stata Journals
    
    SJ-15-3 dm0042_2  . . . . . . . . . . . . . . . . Software update for distinct
            (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
            Q3/15   SJ 15(3):899
            improved table format and display of large numbers of
            observations
    
    SJ-12-2 dm0042_1  . . . . . . . . . . . . . . . . Software update for distinct
            (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
            Q2/12   SJ 12(2):352
            options added to restrict output to variables with a minimum
            or maximum of distinct values
    
    SJ-11-2 dm0057  . . . . . . . . .  Stata tip 99: Taking extra care with encode
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Schechter
            Q2/11   SJ 11(2):321--322                                (no commands)
            tip on safely using encode across datasets
    
    SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
            (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
            Q1/09   SJ 9(1):137--157
            shows how to exploit functions, egen functions, and Mata
            for working rowwise; rowsort and rowranks are introduced
    
    SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
            (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
            Q4/08   SJ 8(4):557--568
            shows how to answer questions about distinct observations
            from first principles; provides a convenience command
    
    (end of search)
    
    .
    If you repeat the search command yourself you will get a clickable link to the 2008 paper and to the 2015 version of the software (or one later if it exists when anyone reads this).



    • #3
      Thank you very much, Nick, for helping me as well as informing me about the "distinct" command.
