Hi Statalist members,
Assume that you have a dataset of the following format:
Code:
clear
input pair firm_id treated year
4 4977869 0 2000
4 4977869 0 2001
4 4977869 0 2002
4 4977869 0 2003
4 4977869 0 2004
4 4977869 0 2005
4 5031421 1 2000
4 5031421 1 2002
4 5031421 1 2004
4 5031421 1 2005
end
For the above, I would like to keep only the observations for years in which both firms of a pair appear. For example, firm_id = 4977869 has two additional years (2001 and 2003) that do not appear for its paired firm (firm_id = 5031421). How can I keep only the observations for the years that appear for both firms in the pair?
The final database should look like this:
pair | firm_id | treated | year |
4 | 4977869 | 0 | 2000 |
4 | 4977869 | 0 | 2002 |
4 | 4977869 | 0 | 2004 |
4 | 4977869 | 0 | 2005 |
4 | 5031421 | 1 | 2000 |
4 | 5031421 | 1 | 2002 |
4 | 5031421 | 1 | 2004 |
4 | 5031421 | 1 | 2005 |
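A minimal sketch of one possible way to get there, not a tested solution. It assumes that each firm appears at most once per pair and year, and the helper variables firm_tag, n_firms and n_present are hypothetical names introduced only for illustration. The idea is to count the distinct firms in each pair, count the firms observed in each pair-year, and keep the years where the two counts match:

```stata
* Assumption: each firm appears at most once per pair-year.
* firm_tag, n_firms and n_present are illustrative helper names.

* Flag one observation per firm within each pair, then count firms per pair.
egen byte firm_tag = tag(pair firm_id)
bysort pair: egen n_firms = total(firm_tag)

* Count how many firms of the pair are observed in each year.
bysort pair year: gen n_present = _N

* Keep only the years in which every firm of the pair is observed.
keep if n_present == n_firms

drop firm_tag n_firms n_present
sort pair firm_id year
```

If every pair is known to contain exactly two firms, the same filter could be written more compactly as `bysort pair year: keep if _N == 2`.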
Thank you.