Keep only pairs of treated and control firms that appear in both years

Pantelis Kazakis

Join Date: Aug 2014

Posts: 123
#1

22 Apr 2023, 09:18

Hi Statalist members,

Assume that you have a dataset of the following format:

Code:

input pair firm_id treated year
4 4977869 0 2000
4 4977869 0 2001
4 4977869 0 2002
4 4977869 0 2003
4 4977869 0 2004
4 4977869 0 2005
4 5031421 1 2000
4 5031421 1 2002
4 5031421 1 2004
4 5031421 1 2005
end

For the above, I would like to keep only the observations for years in which both firms of a pair appear. For example, firm_id = 4977869 has two additional years (2001 and 2003) that do not appear for its paired firm (firm_id = 5031421). How can I keep only the observations for years in which both firms of the pair are present?

The final database should look like this:

pair firm_id treated year
4 4977869 0 2000
4 4977869 0 2002
4 4977869 0 2004
4 4977869 0 2005
4 5031421 1 2000
4 5031421 1 2002
4 5031421 1 2004
4 5031421 1 2005

Thank you.
Clyde Schechter

Join Date: Apr 2014

Posts: 30174
#2

22 Apr 2023, 09:56

Code:

isid firm_id year
by pair year (firm_id), sort: keep if _N == 2
sort pair firm_id year

Note: In order for this code to produce correct results, firm_id and year must uniquely identify observations in the data. This is true in the example. And from the general description, it is probably true in the full data set as well. But to avoid mistaken results arising unwittingly, I have included a check on this at the start of the code. If the -isid- command halts with an error message, then you either need to decide whether there are supposed to be multiple observations for the same firm_id in the same year (in which case this code needs revision), or there shouldn't be any such (in which case the data set is corrupted and needs to be fixed.)
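The _N == 2 test assumes that each pair holds exactly two firms, one treated and one control. If a pair could ever contain more than two firms, or two firms with the same treatment status, that test alone would not catch it. Here is a minimal sketch of a stricter variant, assuming every pair should have exactly one treated and one control firm in each retained year (n_treated and n_control are hypothetical helper variables, not part of the original data):

```stata
* Assumption: firm_id and year uniquely identify observations
isid firm_id year

* Count treated and control firms within each pair-year
* (n_treated and n_control are illustrative names, not in the original data)
bysort pair year: egen n_treated = total(treated == 1)
bysort pair year: egen n_control = total(treated == 0)

* Keep only pair-years with exactly one treated and one control firm
keep if n_treated == 1 & n_control == 1

* Clean up the helper variables and restore a readable sort order
drop n_treated n_control
sort pair firm_id year
```

On the example data this should retain the same eight observations as the code above, because 2001 and 2003 have no treated firm in pair 4 and are therefore dropped.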

Last edited by Clyde Schechter; 22 Apr 2023, 10:00.

