  • Counting distinct values by group

    Hello,

    I have a dataset with an id (permno) and a time variable (date). I have created a new variable, year, using year(date). What I want to do is create a variable called count, which counts the number of distinct years within each permno.
    I have an example attached below. My original data is exactly the same, just without the "count" column.

    I tried using

    by permno (year): gen count = _n
    but it numbers every observation within permno separately, even when several observations fall in the same year.


    Intended output:
    permno  date    year  count
    10001   281986  1986  1
    10001   311986  1986  1
    10001   301986  1986  1
    10001   301986  1986  1
    10001   301986  1986  1
    10001   311986  1986  1
    10001   301987  1987  2
    10001   271987  1987  2
    10001   311987  1987  2
    10001   301987  1987  2
    10001   291987  1987  2
    10001   291988  1988  3
    10001   291988  1988  3
    Any help is appreciated.

    Thanks

  • #2
    In Stata count is called a variable; calling it a column is spreadsheet-speak best reserved for private use.

    Code:
    egen wanted = group(permno year)
    should help.
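
    Note that group() numbers the distinct permno-year combinations across the whole dataset, so the numbering does not restart for each permno. If the counter should start again at 1 within each permno, as the intended output in #1 suggests, a running sum of "year changed" flags is one possible approach. The following is only a sketch: the toy data, including the second permno 10002, are made up for illustration and are not from the original post.

    ```stata
    * Hypothetical toy data loosely following the example in #1
    clear
    input long permno int year
    10001 1986
    10001 1986
    10001 1987
    10001 1988
    10002 1990
    10002 1991
    end

    * group() numbers permno-year combinations over the whole dataset
    egen wanted = group(permno year)

    * Running count of distinct years, restarting at 1 within each permno:
    * year[_n-1] is missing for the first observation of each permno,
    * so the first comparison is true and the sum starts at 1
    bysort permno (year): gen count = sum(year != year[_n-1])

    list, sepby(permno)
    ```

    With these data, wanted keeps increasing into the second permno, while count starts again at 1 there.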


    In your earlier thread we reminded you of the strong preference here for full real names.
    Last edited by Nick Cox; 25 Sep 2019, 01:14.


    • #3
      Thanks for your response, Nick, and apologies, but I don't know how to change my name. Will I have to make a new account?


      • #4
        No; see the FAQ Advice for how to contact the administrators.
