  • drop observations that have overlap in sample period

    Dear statalist,

    I have some data that I want to turn into panel data, that is, with only one observation per firm-year. My data are based on firm pairs: a focal firm (focalfirm) and a matched firm (gvkey). For each pair I created a variable, symbol, that gives the pair a unique ID. Some firm pairs have 2 events within the sample period. I only want to keep the first event and its associated pre and post periods (2 pre-years with post1 = 0 and 3 post-years with post1 = 1). How can I drop the second event and the 5 years of observations related to it?

    For example, in the first 10 rows I want to keep the 5-year period from 2004 to 2008 and delete the second 5-year period from 2006 to 2010, because the two periods overlap in 2006, 2007 and 2008 for the same pair of firms (same focalfirm and gvkey, and therefore the same symbol).

    Thanks a lot for any kind help!

    focalfirm eventdate gvkey fyear post1 eventyear symbol
    135990 30/05/06 11657 2004 0 2006 147647
    135990 30/05/06 11657 2005 0 2006 147647
    135990 30/05/06 11657 2006 1 2006 147647
    135990 30/05/06 11657 2007 1 2006 147647
    135990 30/05/06 11657 2008 1 2006 147647
    135990 19/10/08 11657 2006 0 2008 147647
    135990 19/10/08 11657 2007 0 2008 147647
    135990 19/10/08 11657 2008 1 2008 147647
    135990 19/10/08 11657 2009 1 2008 147647
    135990 19/10/08 11657 2010 1 2008 147647

    135990 30/05/06 14477 2004 0 2006 150467
    135990 30/05/06 14477 2005 0 2006 150467
    135990 30/05/06 14477 2006 1 2006 150467
    135990 30/05/06 14477 2007 1 2006 150467
    135990 30/05/06 14477 2008 1 2006 150467
    135990 19/10/08 14477 2006 0 2008 150467
    135990 19/10/08 14477 2007 0 2008 150467
    135990 19/10/08 14477 2008 1 2008 150467
    135990 19/10/08 14477 2009 1 2008 150467
    135990 19/10/08 14477 2010 1 2008 150467

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long focalfirm str8 eventdate int(gvkey fyear) byte post1 int eventyear long symbol
    135990 "30/05/06" 11657 2004 0 2006 147647
    135990 "30/05/06" 11657 2005 0 2006 147647
    135990 "30/05/06" 11657 2006 1 2006 147647
    135990 "30/05/06" 11657 2007 1 2006 147647
    135990 "30/05/06" 11657 2008 1 2006 147647
    135990 "19/10/08" 11657 2006 0 2008 147647
    135990 "19/10/08" 11657 2007 0 2008 147647
    135990 "19/10/08" 11657 2008 1 2008 147647
    135990 "19/10/08" 11657 2009 1 2008 147647
    135990 "19/10/08" 11657 2010 1 2008 147647
    135990 "30/05/06" 14477 2004 0 2006 150467
    135990 "30/05/06" 14477 2005 0 2006 150467
    135990 "30/05/06" 14477 2006 1 2006 150467
    135990 "30/05/06" 14477 2007 1 2006 150467
    135990 "30/05/06" 14477 2008 1 2006 150467
    135990 "19/10/08" 14477 2006 0 2008 150467
    135990 "19/10/08" 14477 2007 0 2008 150467
    135990 "19/10/08" 14477 2008 1 2008 150467
    135990 "19/10/08" 14477 2009 1 2008 150467
    135990 "19/10/08" 14477 2010 1 2008 150467
    end
    
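    * Number the events within each focalfirm-gvkey pair in chronological order
    by focalfirm gvkey (eventyear fyear), sort: gen event_num = sum(eventyear != eventyear[_n-1])

    * Work on a copy holding one row per pair and event
    frame put focalfirm gvkey event_num eventyear, into(events)
    frame events {
        duplicates drop
        * Flag any event that starts less than 5 years after the preceding event
        * (for the first event the difference is missing, so it is never flagged)
        by focalfirm gvkey (event_num eventyear), sort: gen byte to_drop = ///
            eventyear - eventyear[_n-1] < 5
        drop if to_drop
    }

    * Keep only the observations whose event survived, then clean up
    frlink m:1 focalfirm gvkey event_num, frame(events)
    keep if !missing(events)
    drop events
    frame drop events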

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
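
    If frames are not available (they were introduced in Stata 16), the same rule can probably be expressed with -by- alone. What follows is only a minimal sketch, not part of the code above. It starts from the example data as read in by the -dataex- block, prev_eventyear is a helper variable made up here for illustration, and it assumes, as in the example, that the events within a pair fall in different years.

    Code:
    * Hypothetical helper: on the first row of each event, record the year of the
    * pair's preceding event (missing for the pair's first event)
    bysort focalfirm gvkey (eventyear fyear): gen int prev_eventyear = ///
        eventyear[_n-1] if eventyear != eventyear[_n-1]

    * Copy that value to every row of the same event (missing values sort last)
    bysort focalfirm gvkey eventyear (prev_eventyear): replace prev_eventyear = prev_eventyear[1]

    * Drop events that start less than 5 years after the preceding event;
    * a missing difference is never < 5, so each pair's first event is kept
    drop if eventyear - prev_eventyear < 5
    drop prev_eventyear

    * Assumed check, not in the original: one observation per pair and fiscal year
    isid focalfirm gvkey fyear

    On the example data this should leave the 10 observations that belong to the 2006 event. Like the frames version, it compares each event with the one immediately before it, and -isid- exits with an error if any pair still has more than one observation for a fiscal year.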
