
  • New on SSC: _gdistinct function to generate a count of distinct values

    Hi all,

    A new egen function is now available on SSC (thanks to Christopher Baum). _gdistinct works with egen to generate a count of the distinct values of a variable list and takes a by option.

    Type ssc install _gdistinct to download the package.

    Then, you can run commands such as:

    Code:
    bysort varlist1: egen int newvar = distinct(varlist2)
    Type help _gdistinct for more information.

    After using routines like this several times in my work, I figured that this might be useful to others.


  • #2
    Thanks for this contribution! It's a good name.


    Compare also this function, which is part of egenmore on SSC. It dates from the time before the by prefix became standard with egen, but its by() option gives the same functionality.


    Code:
    . ssc type _gnvals.ado
    *! 1.0.1 NJC 20 November 2000
    *! 1.0.0 NJC 20 July 2000
    program define _gnvals
            version 6
            gettoken type 0 : 0
            gettoken g 0 : 0
            gettoken eqs 0 : 0
    
            syntax varlist [if] [in] [, by(varlist) MISSing]
            tempvar touse
            quietly {
                    mark `touse' `if' `in'
                    if "`missing'" == "" {
                            markout `touse' `varlist', strok
                    }
                    sort `touse' `by' `varlist'
                    by `touse' `by' `varlist': gen `type' `g' = _n == 1 if `touse'
                    by `touse' `by' : replace `g' = sum(`g') if `touse'
                    by `touse' `by' : replace `g' = `g'[_N] if `touse'
                    
            }
    end
    In https://www.stata-journal.com/articl...article=dm0042 we showed that a combination of egen, tag() and egen, total() would get you there, thus combining official functions and not obliging any installation.
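
    For readers who want to see that combination spelled out, here is a minimal sketch using only official egen functions. The auto dataset and the variable names tag and nrep are my own illustrative choices, not taken from the article. Note that tag() ignores missing values unless its missing option is specified.

    Code:
    * illustrative sketch: dataset and variable names are assumptions
    sysuse auto, clear
    * flag exactly one observation for each distinct (foreign, rep78) pair
    egen byte tag = tag(foreign rep78)
    * add up the flags within each group to count distinct values of rep78
    egen int nrep = total(tag), by(foreign)
    * every observation now carries its group's count
    tabulate foreign nrep
    If egenmore is installed, egen nrep2 = nvals(rep78), by(foreign) should give the same count, and assert nrep == nrep2 would check that.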

    Going the other way, a distinctgen command was added in a later update in that sequence.

    Code:
    SJ-23-4 dm0042_5  . . . . . . . . . . . . . . . . Software update for distinct
            (help distinct, distinctgen if installed)  N. J. Cox and G. M. Longton
            Q4/23   SJ 23(4):1096
            comments out (and thus removes) a call to clear Mata at the
            close of work, which was frustrating some other projects
            also using Mata
    
    SJ-23-2 dm0042_4  . . . . . . . . . . . . . . . . Software update for distinct
            (help distinct, distinctgen if installed)  N. J. Cox and G. M. Longton
            Q2/23   SJ 23(2):595--596
            most important change is addition of distinctgen command
