Panel Data - Keeping Observations when using a Difference in Difference Approach

Joshua Le Cornu

08 Feb 2019, 04:25

I have UK panel data from the British Household Panel Survey and Understanding Society, covering 2000-2016, with the survey year in a variable "year" and individual identifiers in "pidp".

How do I keep only those individuals who appear at least once before 2007 and at least once after 2007? I am looking at the impact of a UK policy that came into effect, and hence am trying to limit the sample to those individuals who appear both before and after the threshold.

I have tried a combination of the following, limiting the sample to those who answered for more than, say, 9 years to ensure they appear both before and after, but this greatly limits the data:
bysort pidp : drop if _N < 2 //*13,495
bysort pidp : drop if _N < 3 //*16,530
bysort pidp : drop if _N < 4 //*17,190
bysort pidp : drop if _N < 5 //*15,936
bysort pidp : drop if _N < 6 //*18,120
bysort pidp : drop if _N < 7 //*21,522
bysort pidp : drop if _N < 8 //*28,924
bysort pidp : drop if _N < 9 //*32,080

Thanks,
Joshua

Clyde Schechter

08 Feb 2019, 09:54

Code:

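* Flag individuals observed at least once before 2007 and at least once after
by pidp, sort: egen before_2007 = max(year < 2007)
by pidp: egen after_2007 = max(year > 2007)
* Keep only individuals observed on both sides of 2007
keep if before_2007 & after_2007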
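
To see what this does, here is a minimal sketch on a made-up toy panel. The pidp values and years below are hypothetical and not taken from the BHPS or Understanding Society data. It assumes the same variable names as above and adds one guard that is not in the code above: Stata treats a missing value as larger than any number, so year > 2007 would also be true for a missing year. If "year" is never missing in your data, the guard changes nothing.

```stata
* Toy panel with hypothetical ids and years, for illustration only
clear
input long pidp int year
1 2004
1 2009
2 2005
2 2006
3 2008
3 2010
4 2006
4 2007
4 2011
5 2007
end

* max() of a 0/1 expression within pidp is 1 if it is true in any wave
bysort pidp: egen byte before_2007 = max(year < 2007)
* !missing(year) is an added guard: a missing year would satisfy year > 2007
bysort pidp: egen byte after_2007 = max(year > 2007 & !missing(year))

* Keep every wave of individuals seen on both sides of 2007
keep if before_2007 & after_2007

* Persons 1 and 4 remain; person 4's 2007 wave is kept as well
list pidp year, sepby(pidp)
```

Note that the filter works on individuals, not on single observations: anyone who qualifies keeps all of their waves, including any wave in 2007 itself. Someone observed only in 2007 (person 5 here) is dropped, because 2007 counts as neither before nor after.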
