  • Panel Data - Keeping Observations when Using a Difference-in-Differences Approach

    I have UK panel data from the British Household Panel Survey and Understanding Society, covering 2000-2016, with the survey year in the variable "year" and individual identifiers in "pidp".

    How do I keep those individuals who appear at least once before 2007 and at least once after 2007? I am looking at the impact of a UK policy that came into effect then, so I am trying to limit the sample to individuals who appear both before and after the threshold.

    I have tried a combination of the following, keeping only those who answered in more than, say, 9 years to ensure they are observed both before and after, but this greatly limits the data:
    bysort pidp : drop if _N < 2 //*13,495
    bysort pidp : drop if _N < 3 //*16,530
    bysort pidp : drop if _N < 4 //*17,190
    bysort pidp : drop if _N < 5 //*15,936
    bysort pidp : drop if _N < 6 //*18,120
    bysort pidp : drop if _N < 7 //*21,522
    bysort pidp : drop if _N < 8 //*28,924
    bysort pidp : drop if _N < 9 //*32,080



    Thanks,
    Joshua


  • #2
    Code:
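    * Flag individuals observed at least once before 2007
    by pidp, sort: egen before_2007 = max(year < 2007)
    * Flag individuals observed at least once after 2007
    by pidp: egen after_2007 = max(year > 2007)
    * Keep only individuals observed on both sides of 2007
    keep if before_2007 & after_2007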
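
    To check how this behaves, here is a minimal sketch on a toy dataset. The eight rows below are made up for illustration and are not from the survey. Because both comparisons are strict, an observation in 2007 itself counts toward neither flag, so whether 2007 belongs to the "before" or "after" period is a choice you may want to adjust.

```stata
clear
* Hypothetical toy panel: four individuals, two observations each
input pidp year
1 2003
1 2005
2 2004
2 2009
3 2008
3 2010
4 2007
4 2011
end

* 1 if the individual has any observation before 2007, else 0
by pidp, sort: egen before_2007 = max(year < 2007)
* 1 if the individual has any observation after 2007, else 0
by pidp: egen after_2007 = max(year > 2007)

* Keep individuals seen on both sides of 2007
keep if before_2007 & after_2007
list pidp year, sepby(pidp)
```

    In this toy data only pidp 2 is kept: pidp 1 is never observed after 2007, pidp 3 is never observed before it, and pidp 4 has only a 2007 observation on the "before" side, which the strict inequality does not count.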
