Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identifying first ID entry and ID exit in a panel dataset

    Hi,
    I am working with a panel of firms (identified with idfinal) across years (year). The panel is not balanced, so there is entry of new firms every year.
    I would like to construct two identifiers, one which would tell me, for each year, what firms are showing up for the first time; and another that would indicate what firms are showing up for the last time in the data. What code can write?

    Please find below a sample of my dataset. For example, following the below data, I would like to generate two variables:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double(idfinal year) float prod_male
     1 2016     20
     2 2017     14
     3 2013   4.25
     3 2017     51
     4 2017     12
     6 2012    105
     6 2013  70.75
     6 2014 150.75
     6 2015     91
     6 2016  84.75
     7 2012   47.5
     7 2013   47.5
     8 2012      8
     8 2013      8
     9 2012     27
     9 2013     90
     9 2014    100
     9 2016     86
     9 2017 100.25
    10 2010  35.25
    10 2011     90
    10 2012     80
    10 2013   86.5
    10 2014     87
    10 2015   33.5
    11 1996 111.75
    11 1998 111.75
    11 1999     71
    11 2000   74.5
    11 2001   74.5
    11 2002   74.5
    11 2003   74.5
    11 2004  35.25
    11 2005  35.25
    11 2006     38
    11 2007  23.75
    11 2008     35
    11 2009      .
    11 2010     22
    11 2012   19.5
    11 2013  18.25
    11 2014  22.75
    12 2013 119.75
    12 2014 119.25
    13 2011 161.75
    13 2012 129.75
    13 2013 129.75
    13 2014     24
    14 1996    391
    14 1998 311.25
    14 1999 419.75
    14 2000 419.75
    14 2001    402
    14 2002    472
    14 2003    450
    14 2004 446.25
    14 2005 426.25
    14 2006    377
    14 2007    672
    14 2008     84
    14 2009      .
    14 2010 639.75
    14 2011    953
    14 2012 637.75
    14 2013  721.5
    14 2014   1032
    15 2012 408.75
    15 2013     26
    15 2014     26
    15 2015    6.5
    15 2016     10
    15 2017    8.5
    16 1996  26.75
    16 1998     25
    16 1999     11
    16 2000     31
    16 2001   34.5
    16 2002   19.5
    16 2003    9.5
    16 2004  20.25
    16 2006   19.5
    16 2007   19.5
    16 2008   19.5
    16 2009      .
    16 2010   7.25
    16 2012     10
    16 2013   6.75
    16 2014      8
    16 2015     17
    17 2011     59
    17 2012     59
    17 2013     25
    17 2014     25
    17 2015  15.75
    18 2011  18.75
    18 2013   20.5
    18 2014   29.5
    18 2015     10
    18 2016  18.25
    18 2017     36
    end
    Thanks,
    Marina

  • #2
    Code:
    bys idfinal (year): gen first=_n==1
    bys idfinal (year): gen last=_n==_N
    Then, e.g., which firms entered the sample in 2013?

    Code:
    list idfinal if first & year==2013, sep(0)

    Which firms exited the sample in 2017?

    Code:
    list idfinal if last & year==2017, sep(0)
    Res.:


    Code:
    . list idfinal if first & year==2013, sep(0)
    
         +---------+
         | idfinal |
         |---------|
      3. |       3 |
     43. |      12 |
         +---------+
    
    . list idfinal if last & year==2017, sep(0)
    
         +---------+
         | idfinal |
         |---------|
      2. |       2 |
      4. |       3 |
      5. |       4 |
     19. |       9 |
     72. |      15 |
    100. |      18 |
         +---------+
    
    .

    Comment


    • #3
      Also,

      Code:
      tab year if first 
      
      tab year if last
      and so on. And also

      Code:
      egen firstyear = total(first * year), by(idfinal) 
      egen lastyear = total(last * year), by(idfinal) 
      egen tag = tag(idfinal) 
      
      tab firstyear lastyear if tag

      Comment


      • #4
        Hi Andrew and Nick,
        Thank you so much for your help.

        Comment

        Working...
        X