
  • Count the number of different values (ignoring missing)

    Hello everyone,

    I want to count the number of distinct values of seqage within each pid, ignoring missing values.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long pid byte age float seqage
     201 45 2
     201 46 2
     201 47 2
     201 48 2
     201 49 2
     201 50 3
     201 51 .
     201 52 4
     201 53 4
     201 54 4
     201 55 4
    1101 45 5
    1101 46 5
    1101 47 5
    1101 48 5
    1101 49 5
    1101 50 5
    1101 51 5
    1101 52 5
    1101 53 5
    1101 54 5
    1101 55 5
    end
    This is how my data look.

    For pid 201, the number of different values (2, 3, 4) is three.
    For pid 1101, the number of different values (5) is one.


    Thank you very much for your help.

    Halim.

  • #2
    Thanks for the data example. Here you go:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long pid byte age float seqage
     201 45 2
     201 46 2
     201 47 2
     201 48 2
     201 49 2
     201 50 3
     201 51 .
     201 52 4
     201 53 4
     201 54 4
     201 55 4
    1101 45 5
    1101 46 5
    1101 47 5
    1101 48 5
    1101 49 5
    1101 50 5
    1101 51 5
    1101 52 5
    1101 53 5
    1101 54 5
    1101 55 5
    end
    
    egen tag = tag(pid seqage) 
    egen wanted = total(tag), by(pid)
    
    tabdisp pid, c(wanted) 
    
    ----------------------
          pid |     wanted
    ----------+-----------
          201 |          3
         1101 |          1
    ----------------------

    This is a much-discussed question. For a fairly pedestrian discussion, see https://www.stata-journal.com/articl...article=dm0042, that is:

    SJ-8-4 dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
    (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
    Q4/08 SJ 8(4):557--568
    shows how to answer questions about distinct observations
    from first principles; provides a convenience command


    dm0042 is thus revealed as an otherwise unpredictable search term to find related threads here.
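
    For readers curious what "from first principles" could look like here, the following is a minimal sketch and is not quoted from the article. It uses an abridged copy of the example data so that it runs on its own, and the variable names first and wanted2 are made up for illustration.

    ```stata
    * Abridged copy of the example data (same distinct values per pid)
    clear
    input long pid byte age float seqage
     201 45 2
     201 49 2
     201 50 3
     201 51 .
     201 52 4
     201 55 4
    1101 45 5
    1101 55 5
    end

    * Sort by pid and seqage, then flag the first observation of each
    * (pid, seqage) group, but only when seqage is not missing
    bysort pid seqage: gen byte first = (_n == 1) & !missing(seqage)

    * Add up the flags within pid: the number of distinct nonmissing values
    egen wanted2 = total(first), by(pid)

    tabdisp pid, c(wanted2)
    ```

    On these data wanted2 should agree with the result from egen's tag() above: 3 for pid 201 and 1 for pid 1101. Note that bysort changes the sort order of the data, which may or may not matter for what you do next.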


    Note that missings don't bite in your example, but ignoring them is easy:


    Code:
    egen tag = tag(pid seqage) if !missing(seqage)
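
    To see what the qualifier guards against, one can compare it with tag()'s missing option, which, as I read the help for egen, allows missing values of the variables to be tagged as a group of their own. This is only a sketch on an abridged copy of the example data; the names tag_nomiss, tag_miss, n_nomiss and n_miss are hypothetical.

    ```stata
    * Abridged copy of the example data (same distinct values per pid)
    clear
    input long pid byte age float seqage
     201 45 2
     201 49 2
     201 50 3
     201 51 .
     201 52 4
     201 55 4
    1101 45 5
    1101 55 5
    end

    * Tag one observation per (pid, seqage) group, excluding missing seqage
    egen tag_nomiss = tag(pid seqage) if !missing(seqage)

    * Same, but let missing seqage count as a distinct group
    egen tag_miss = tag(pid seqage), missing

    * Count the tagged observations within each pid
    egen n_nomiss = total(tag_nomiss), by(pid)
    egen n_miss   = total(tag_miss), by(pid)

    tabdisp pid, c(n_nomiss n_miss)
    ```

    For pid 201 the two counts should differ (3 without missing, 4 with missing counted as a value); for pid 1101, which has no missing seqage, both should be 1.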



  • #3
    You are helping me so much while I'm writing my thesis, Nick.

    Thank you again.
