
  • How to drop observations without having both pre and post years

    Hello

    During sample selection, I want to keep a firm only if it has data for both a pre year and a post year (I'll use them for a DID test later). The relevant variables are firm, year, and post (= 1 if the year is a post year, = 0 if it is a pre year). I want to drop observations that have only a pre year or only a post year, for example firm 151 in the data below, which has only a post year. How can I do this?

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long firm float(year post)
      4 2018 0
      4 2019 1
      7 2013 0
      7 2014 1
      9 2015 0
      9 2016 1
     32 2019 0
     32 2020 1
     35 2012 0
     35 2013 1
     40 2016 0
     40 2017 1
     59 2018 0
     59 2019 1
     62 2016 0
     62 2017 1
     62 2018 0
     62 2019 1
     65 2016 0
     65 2017 1
     68 2014 0
     68 2015 1
     70 2014 0
     70 2015 1
     90 2015 0
     90 2016 1
    151 2010 1
    153 2013 0
    153 2014 1
    156 2018 0
    156 2019 1
    158 2013 0
    158 2014 1
    401 2010 1
    401 2013 0
    401 2014 1
    407 2015 0
    407 2016 1
    408 2010 1
    408 2013 0
    408 2014 1
    411 2016 0
    411 2017 1
    416 2013 0
    416 2014 1
    426 2013 0
    426 2014 1
    426 2018 0
    426 2019 1
    498 2018 0
    498 2019 1
    503 2018 0
    503 2019 1
    518 2016 0
    518 2017 1
    523 2017 0
    523 2018 1
    524 2013 0
    524 2014 1
    544 2014 0
    544 2015 1
    544 2016 0
    544 2017 1
    546 2013 0
    546 2014 1
    546 2015 0
    546 2016 1
    547 2012 0
    547 2013 1
    557 2013 0
    557 2014 1
    560 2010 0
    560 2011 1
    564 2013 0
    564 2014 1
    564 2018 0
    564 2019 1
    566 2015 0
    566 2016 1
    573 2010 1
    582 2013 0
    582 2014 1
    592 2011 1
    593 2014 0
    593 2015 1
    593 2018 0
    593 2019 1
    606 2010 0
    606 2011 1
    607 2015 0
    607 2016 1
    607 2017 0
    607 2018 1
    608 2010 1
    609 2018 0
    609 2019 1
    612 2013 0
    612 2014 1
    616 2014 0
    616 2015 1
    end
    format %ty year

  • #2
    Here are two solutions:

    For the first, just code
    Code:
    bys firm (year): gen byte to_keep = (year - year[_n-1] == 1 | year[_n+1] - year == 1)
    As an alternative, you could use the community-contributed command rangestat, which can be installed using
    Code:
    ssc install rangestat
    After installation, the code to use is simply:
    Code:
    rangestat (count) to_keep = year, interval(year -1 1) by(firm) excludeself
    Either of these produces:
    Code:
    . li firm year if !to_keep, noobs sep(0)
      +-------------+
      | firm   year |
      |-------------|
      |  151   2010 |
      |  401   2010 |
      |  408   2010 |
      |  573   2010 |
      |  592   2011 |
      |  608   2010 |
      +-------------+
    You can then just
    Code:
    drop if !to_keep
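    After dropping, a quick sanity check is to confirm that every remaining firm has at least one pre year and one post year. This is only a sketch: has_pre and has_post are helper names made up for illustration, and the check is a necessary condition only, since it does not verify that each pre year is matched to an adjacent post year within the firm.
    Code:
    * has_pre / has_post are hypothetical helper variables, not part of the original data
    bysort firm: egen byte has_pre  = max(post == 0)
    bysort firm: egen byte has_post = max(post == 1)
    * stops with an error if any remaining firm lacks a pre or a post year
    assert has_pre & has_post
    drop has_pre has_post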
    Last edited by Hemanshu Kumar; 30 Nov 2022, 23:07.


    • #3
      Thanks for the reply; both work well!


      • #4
        Hi again,

        I found a new problem with both approaches above. After running the code, the firms below still have observations with only a pre year or only a post year. I think the difference from the examples in #1 is that here the same firm also has other years that do form a pre/post pair. Can the code be improved to handle this?
        Code:
        firm   year    post
        636    2014    0
        636    2015    1
        638    2010    1
        638    2011    0
        638    2012    1
        656    2019    0
        656    2020    1
        300145    2018    0
        300145    2019    1
        300147    2011    1
        300147    2012    0
        300147    2013    1
        300147    2015    0
        300147    2016    1
        In the examples above, firm 638 in 2010 is labelled as post. Firm 638 also has an observation in 2011, but that is a pre year and it is paired with firm 638 in 2012 (post). So firm 638 in 2010 has no matching pre year and should be dropped.


        • #5
          Code:
          bys firm (year): gen byte to_drop = (post == 1 & missing(year[_n-1])) | (post == 0 & missing(year[_n+1]))
          bys firm (year): gen byte to_keep = (year - year[_n-1] == 1 | year[_n+1] - year == 1) & !to_drop
          drop to_drop
          which produces the following (using a dataset combining your original sample with the one in #4):
          Code:
          . li firm year if !to_keep, noobs sep(0)
            +---------------+
            |   firm   year |
            |---------------|
            |    151   2010 |
            |    401   2010 |
            |    408   2010 |
            |    573   2010 |
            |    592   2011 |
            |    608   2010 |
            |    638   2010 |
            | 300147   2011 |
            +---------------+


          • #6
            Hi Hemanshu,

            Thanks a lot for helping me again. However, I found some more cases where an observation should have been dropped but was not. Because my sample is not large, I checked all observations manually and believe these 3 cases are the only problems left.
            Code:
            firm    year    post    to_drop    to_keep
            600139    2012    1    1    0
            600139    2013    1    0    1
            600236    2013    1    1    0
            600236    2016    1    0    0
            600561    2014    1    1    0
            600561    2015    1    0    1
            Thanks a lot in advance : )


            • #7
              See if this works:
              Code:
              sort firm year
              by firm: gen byte to_drop = (post == 1 & missing(year[_n-1])) | (post == 0 & missing(year[_n+1]))
              by firm: replace to_drop = !(post - post[_n-1] == 1 | post[_n+1] - post == 1) if year - year[_n-1] == 1 & !to_drop
              by firm: gen byte to_keep = (year - year[_n-1] == 1 | year[_n+1] - year == 1) & !to_drop
              drop to_drop
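              As a possible simplification, the same idea can be written as a single condition on matched pairs: keep a pre year only if the next observation for the firm is a post year exactly one year later, and keep a post year only if the previous observation is a pre year exactly one year earlier. This is a sketch, not tested on the full sample. It assumes at most one observation per firm and year (the isid line checks this), and to_keep2 is a name made up here so that it does not overwrite to_keep.
              Code:
              * assumption: firm and year uniquely identify observations; isid errors out otherwise
              isid firm year
              * /// continues a line in a do-file; type it on one line if running interactively
              bysort firm (year): gen byte to_keep2 = ///
                  (post == 0 & post[_n+1] == 1 & year[_n+1] == year + 1) | ///
                  (post == 1 & post[_n-1] == 0 & year[_n-1] == year - 1)
              * compare with to_keep from the code above before dropping anything
              count if to_keep != to_keep2
              At the first and last observation of a firm, the out-of-range subscripts return missing, so the comparisons are false and unpaired observations get to_keep2 = 0.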


              • #8
                It works this time, thanks a lot for your help and patience.
