Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Remove rows if variable value does not have a certain amount of observations

    Hello, sorry if the title is not clear! A small sample of my data can be seen below. My aim is to delete all rows from a certain company (TICKER) if they do not have observations from all years 2010-2020. For example, in the data below, I would keep all the rows for ticker A, however since AAA and AAAP only have data for some years I would want to drop all the rows for those tickers. I have tried to play around with the DROP IF function however I am new to Stata so I'm also having difficulties finding search terms for help on what I'm looking for. Sorry since this has probably been answered previously.

    Code:
    TICKER    year    RET
    A    2010    .360587924
    A    2011    -.105289804
    A    2012    .200218003
    A    2013    .368424385
    A    2014    .002363122
    A    2015    .065968707
    A    2016    .119250455
    A    2017    .40943002
    A    2018    .040816839
    A    2019    .279583614
    A    2020    .367221266
    AA    2010    .055579704
    AA    2011    -.490790864
    AA    2012    .041280666
    AA    2013    .237985016
    AA    2014    .427898743
    AA    2015    -.425617623
    AA    2016    .055984234
    AA    2017    .764060466
    AA    2018    -.64663515
    AA    2019    -.139412225
    AA    2020    .646594437
    AAA    2020    .007631533
    AAAP    2015    .182677761
    AAAP    2016    -.033216692
    AAAP    2017    1.2492474
    AAAP    2018    -.014702334

  • #2

    Code:
    bysort TICKER : drop if _N < 11

    Comment


    • #3
      This assumes that observations may be outside the range 2010-2020. Otherwise, see Nick's code.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str4 ticker int year double ret
      "A"    2010  .360587924
      "A"    2011 -.105289804
      "A"    2012  .200218003
      "A"    2013  .368424385
      "A"    2014  .002363122
      "A"    2015  .065968707
      "A"    2016  .119250455
      "A"    2017   .40943002
      "A"    2018  .040816839
      "A"    2019  .279583614
      "A"    2020  .367221266
      "AA"   2010  .055579704
      "AA"   2011 -.490790864
      "AA"   2012  .041280666
      "AA"   2013  .237985016
      "AA"   2014  .427898743
      "AA"   2015 -.425617623
      "AA"   2016  .055984234
      "AA"   2017  .764060466
      "AA"   2018  -.64663515
      "AA"   2019 -.139412225
      "AA"   2020  .646594437
      "AAA"  2020  .007631533
      "AAAP" 2015  .182677761
      "AAAP" 2016 -.033216692
      "AAAP" 2017   1.2492474
      "AAAP" 2018 -.014702334
      end
      
      
      isid ticker year
      gen decade10s= inrange(year, 2010, 2020)
      bys ticker decade10s (year): egen nobs= count(ticker) if decade10s  
      bys ticker (nobs): replace nobs= nobs[1]
      keep if nobs==11
      Res.:

      Code:
      . l, sepby(ticker)
      
           +----------------------------------------------+
           | ticker   year          ret   deca~10s   nobs |
           |----------------------------------------------|
        1. |      A   2010    .36058792          1     11 |
        2. |      A   2011    -.1052898          1     11 |
        3. |      A   2012      .200218          1     11 |
        4. |      A   2013    .36842439          1     11 |
        5. |      A   2014    .00236312          1     11 |
        6. |      A   2015    .06596871          1     11 |
        7. |      A   2016    .11925046          1     11 |
        8. |      A   2017    .40943002          1     11 |
        9. |      A   2018    .04081684          1     11 |
       10. |      A   2019    .27958361          1     11 |
       11. |      A   2020    .36722127          1     11 |
           |----------------------------------------------|
       12. |     AA   2010     .0555797          1     11 |
       13. |     AA   2011   -.49079086          1     11 |
       14. |     AA   2012    .04128067          1     11 |
       15. |     AA   2013    .23798502          1     11 |
       16. |     AA   2014    .42789874          1     11 |
       17. |     AA   2015   -.42561762          1     11 |
       18. |     AA   2016    .05598423          1     11 |
       19. |     AA   2017    .76406047          1     11 |
       20. |     AA   2018   -.64663515          1     11 |
       21. |     AA   2019   -.13941223          1     11 |
       22. |     AA   2020    .64659444          1     11 |
           +----------------------------------------------+

      Comment


      • #4
        Thank you so much both of you!

        Comment

        Working...
        X