drop observations that have overlap in sample period

Alice Yang

Join Date: Mar 2022

Posts: 69
#1


22 Aug 2023, 22:34

Dear Statalist,

I have some data that I want to turn into panel data, that is, for each firm-year there should be only one observation. My data are based on firm pairs: a focal firm (focalfirm) and a matched firm (gvkey). For each pair I created a variable, symbol, that gives the pair a unique ID. Some firm pairs have two events within the sample period. I only want to keep the first event and its associated pre- and post-event periods (2 pre-event years with post1 = 0 and 3 post-event years with post1 = 1). How can I drop the second event and the 5 years of observations related to it?

For example, in the first 10 observations, I want to keep the 5-year period from 2004 to 2008 and delete the second 5-year period from 2006 to 2010, because for the same pair of firms (same focalfirm and gvkey, therefore same symbol) the two periods overlap in 2006, 2007, and 2008.

Thanks a lot for any kind help!

focalfirm eventdate gvkey fyear post1 eventyear symbol
135990 30/05/06 11657 2004 0 2006 147647
135990 30/05/06 11657 2005 0 2006 147647
135990 30/05/06 11657 2006 1 2006 147647
135990 30/05/06 11657 2007 1 2006 147647
135990 30/05/06 11657 2008 1 2006 147647
135990 19/10/08 11657 2006 0 2008 147647
135990 19/10/08 11657 2007 0 2008 147647
135990 19/10/08 11657 2008 1 2008 147647
135990 19/10/08 11657 2009 1 2008 147647
135990 19/10/08 11657 2010 1 2008 147647

135990 30/05/06 14477 2004 0 2006 150467
135990 30/05/06 14477 2005 0 2006 150467
135990 30/05/06 14477 2006 1 2006 150467
135990 30/05/06 14477 2007 1 2006 150467
135990 30/05/06 14477 2008 1 2006 150467
135990 19/10/08 14477 2006 0 2008 150467
135990 19/10/08 14477 2007 0 2008 150467
135990 19/10/08 14477 2008 1 2008 150467
135990 19/10/08 14477 2009 1 2008 150467
135990 19/10/08 14477 2010 1 2008 150467

Clyde Schechter

Join Date: Apr 2014
Posts: 30164

23 Aug 2023, 11:31

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input long focalfirm str8 eventdate int(gvkey fyear) byte post1 int eventyear long symbol
135990 "30/05/06" 11657 2004 0 2006 147647
135990 "30/05/06" 11657 2005 0 2006 147647
135990 "30/05/06" 11657 2006 1 2006 147647
135990 "30/05/06" 11657 2007 1 2006 147647
135990 "30/05/06" 11657 2008 1 2006 147647
135990 "19/10/08" 11657 2006 0 2008 147647
135990 "19/10/08" 11657 2007 0 2008 147647
135990 "19/10/08" 11657 2008 1 2008 147647
135990 "19/10/08" 11657 2009 1 2008 147647
135990 "19/10/08" 11657 2010 1 2008 147647
135990 "30/05/06" 14477 2004 0 2006 150467
135990 "30/05/06" 14477 2005 0 2006 150467
135990 "30/05/06" 14477 2006 1 2006 150467
135990 "30/05/06" 14477 2007 1 2006 150467
135990 "30/05/06" 14477 2008 1 2006 150467
135990 "19/10/08" 14477 2006 0 2008 150467
135990 "19/10/08" 14477 2007 0 2008 150467
135990 "19/10/08" 14477 2008 1 2008 150467
135990 "19/10/08" 14477 2009 1 2008 150467
135990 "19/10/08" 14477 2010 1 2008 150467
end

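* Number the events within each firm pair in chronological order
by focalfirm gvkey (eventyear fyear), sort: gen event_num = sum(eventyear != eventyear[_n-1])

* Build a frame with one row per pair and event
frame put focalfirm gvkey event_num eventyear, into(events)
frame events {
    duplicates drop
    // Windows run from eventyear-2 to eventyear+2, so an event fewer than
    // 5 years after the preceding one overlaps it
    by focalfirm gvkey (event_num eventyear), sort: gen byte to_drop = ///
        eventyear - eventyear[_n-1] < 5
    drop if to_drop
}

* Keep only observations whose event survived in the events frame
frlink m:1 focalfirm gvkey event_num, frame(events)
keep if !missing(events)
drop events
frame drop events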
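
If the goal is simply to keep the earliest event for every pair, as the question states, a shorter route that does not need frames (which require Stata 16 or later) might look like the sketch below. It assumes that eventyear distinguishes events within a focalfirm-gvkey pair, and it applies a different rule from the code above: every later event is dropped, whether or not its window overlaps the first one. The isid line is only a check; it stops with an error if any pair still has more than one observation per fiscal year.

```stata
* Sketch only: keep the observations belonging to the earliest event of each pair.
* Assumes eventyear distinguishes events within a focalfirm-gvkey pair.
bysort focalfirm gvkey (eventyear fyear): keep if eventyear == eventyear[1]

* Check: errors out unless focalfirm, gvkey and fyear uniquely identify observations
isid focalfirm gvkey fyear
```

Run on the example data above, this would leave the 2004 to 2008 observations for each of the two pairs.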

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
