Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Deleting entire Firm based on condition from panel data set

    Hello Stata community, I have a panel dataset consisting of many firms over 20 years. Firms are identified by DSCD. I have generated a variable (twonegcashflow) that = 1 in any firm year where cash flows are below 0 and the previous year's cash flows were also below 0. I would like to remove the entire firm from my dataset if in any year the variable "twonegcashflow" takes the value of 1.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long DSCD int year float twonegcashflow double cfps
    1 1999 .  1.026
    1 2000 .  1.268
    1 2001 .  1.499
    1 2002 .  1.493
    1 2003 .  1.183
    1 2004 .  1.709
    1 2005 .  2.258
    1 2006 .  2.576
    1 2007 .  2.656
    1 2008 .  2.898
    1 2009 .  1.998
    1 2010 .  1.644
    1 2011 .  1.927
    1 2012 .  2.139
    1 2013 .   2.61
    1 2014 .  2.677
    1 2015 .  3.435
    1 2016 .  3.499
    1 2017 .  3.279
    1 2018 .  3.804
    1 2019 .  4.378
    1 2020 .      .
    1 2021 .      .
    2 1999 .  5.134
    2 2000 .  5.443
    2 2001 .  5.444
    2 2002 .  4.871
    2 2003 .  5.251
    2 2004 .  5.821
    2 2005 .  7.139
    2 2006 .   8.35
    2 2007 .  9.211
    2 2008 .  8.615
    2 2009 .  6.729
    2 2010 .   6.47
    2 2011 .  5.807
    2 2012 .  6.195
    2 2013 .  6.898
    2 2014 .  6.767
    2 2015 .   9.84
    2 2016 . 12.122
    2 2017 . 12.171
    2 2018 . 14.214
    2 2019 . 16.786
    2 2020 . 17.909
    2 2021 .      .
    3 1999 .   .009
    3 2000 .  -.432
    3 2001 1 -1.075
    3 2002 1  -.499
    3 2003 1  -.169
    3 2004 1  -.269
    3 2005 .   .119
    3 2006 .   .321
    3 2007 .   .323
    3 2008 .  -.506
    3 2009 .      .
    3 2010 .      .
    3 2011 .      .
    3 2012 .      .
    3 2013 .      .
    3 2014 .      .
    3 2015 .      .
    3 2016 .      .
    3 2017 .      .
    3 2018 .      .
    3 2019 .      .
    3 2020 .      .
    3 2021 .      .
    4 1999 .  1.567
    4 2000 .  2.474
    4 2001 .  2.202
    4 2002 .  2.771
    4 2003 .  2.226
    4 2004 .  2.351
    4 2005 .  2.405
    4 2006 .  2.533
    4 2007 .  2.975
    4 2008 .  2.968
    4 2009 .  4.122
    4 2010 .  3.214
    4 2011 .  3.475
    4 2012 .  3.604
    4 2013 .   3.97
    4 2014 .  2.127
    4 2015 .  1.411
    4 2016 .  1.737
    4 2017 .  2.057
    4 2018 .  2.408
    4 2019 .  1.714
    4 2020 .  1.585
    4 2021 .      .
    5 1999 .   .176
    5 2000 .   .162
    5 2001 .   .201
    5 2002 .   .339
    5 2003 .   .541
    5 2004 .   .583
    5 2005 .   .719
    5 2006 .   .839
    end
    label values DSCD DSCD1
    label def DSCD1 1 "130042", modify
    label def DSCD1 2 "130062", modify
    label def DSCD1 3 "130079", modify
    label def DSCD1 4 "130086", modify
    label def DSCD1 5 "130088", modify

    I have tried to use the following code:

    bysort DSCD year: drop if (twonegcashflow[_N])==1

    However, this only seems to drop the firm years where twonegcashflow =1 and not the entire firm.

    Any guidance would be greatly appreciated.

    Thanks,
    Ammar.

  • #2
    Ammar:
    you may want to try:
    Code:
    . bysort DSCD (year): egen wanted=sum( twonegcashflow)
    
    . drop if wanted>1
    
    
    . list
    
         +------------------------------------------+
         | DSCD   year   twoneg~w     cfps   wanted |
         |------------------------------------------|
      1. |    1   1999          .    1.026        0 |
      2. |    1   2000          .    1.268        0 |
      3. |    1   2001          .    1.499        0 |
      4. |    1   2002          .    1.493        0 |
      5. |    1   2003          .    1.183        0 |
         |------------------------------------------|
      6. |    1   2004          .    1.709        0 |
      7. |    1   2005          .    2.258        0 |
      8. |    1   2006          .    2.576        0 |
      9. |    1   2007          .    2.656        0 |
     10. |    1   2008          .    2.898        0 |
         |------------------------------------------|
     11. |    1   2009          .    1.998        0 |
     12. |    1   2010          .    1.644        0 |
     13. |    1   2011          .    1.927        0 |
     14. |    1   2012          .    2.139        0 |
     15. |    1   2013          .     2.61        0 |
         |------------------------------------------|
     16. |    1   2014          .    2.677        0 |
     17. |    1   2015          .    3.435        0 |
     18. |    1   2016          .    3.499        0 |
     19. |    1   2017          .    3.279        0 |
     20. |    1   2018          .    3.804        0 |
         |------------------------------------------|
     21. |    1   2019          .    4.378        0 |
     22. |    1   2020          .        .        0 |
     23. |    1   2021          .        .        0 |
     24. |    2   1999          .    5.134        0 |
     25. |    2   2000          .    5.443        0 |
         |------------------------------------------|
     26. |    2   2001          .    5.444        0 |
     27. |    2   2002          .    4.871        0 |
     28. |    2   2003          .    5.251        0 |
     29. |    2   2004          .    5.821        0 |
     30. |    2   2005          .    7.139        0 |
         |------------------------------------------|
     31. |    2   2006          .     8.35        0 |
     32. |    2   2007          .    9.211        0 |
     33. |    2   2008          .    8.615        0 |
     34. |    2   2009          .    6.729        0 |
     35. |    2   2010          .     6.47        0 |
         |------------------------------------------|
     36. |    2   2011          .    5.807        0 |
     37. |    2   2012          .    6.195        0 |
     38. |    2   2013          .    6.898        0 |
     39. |    2   2014          .    6.767        0 |
     40. |    2   2015          .     9.84        0 |
         |------------------------------------------|
     41. |    2   2016          .   12.122        0 |
     42. |    2   2017          .   12.171        0 |
     43. |    2   2018          .   14.214        0 |
     44. |    2   2019          .   16.786        0 |
     45. |    2   2020          .   17.909        0 |
         |------------------------------------------|
     46. |    2   2021          .        .        0 |
     47. |    4   1999          .    1.567        0 |
     48. |    4   2000          .    2.474        0 |
     49. |    4   2001          .    2.202        0 |
     50. |    4   2002          .    2.771        0 |
         |------------------------------------------|
     51. |    4   2003          .    2.226        0 |
     52. |    4   2004          .    2.351        0 |
     53. |    4   2005          .    2.405        0 |
     54. |    4   2006          .    2.533        0 |
     55. |    4   2007          .    2.975        0 |
         |------------------------------------------|
     56. |    4   2008          .    2.968        0 |
     57. |    4   2009          .    4.122        0 |
     58. |    4   2010          .    3.214        0 |
     59. |    4   2011          .    3.475        0 |
     60. |    4   2012          .    3.604        0 |
         |------------------------------------------|
     61. |    4   2013          .     3.97        0 |
     62. |    4   2014          .    2.127        0 |
     63. |    4   2015          .    1.411        0 |
     64. |    4   2016          .    1.737        0 |
     65. |    4   2017          .    2.057        0 |
         |------------------------------------------|
     66. |    4   2018          .    2.408        0 |
     67. |    4   2019          .    1.714        0 |
     68. |    4   2020          .    1.585        0 |
     69. |    4   2021          .        .        0 |
     70. |    5   1999          .     .176        0 |
         |------------------------------------------|
     71. |    5   2000          .     .162        0 |
     72. |    5   2001          .     .201        0 |
     73. |    5   2002          .     .339        0 |
     74. |    5   2003          .     .541        0 |
     75. |    5   2004          .     .583        0 |
         |------------------------------------------|
     76. |    5   2005          .     .719        0 |
     77. |    5   2006          .     .839        0 |
         +------------------------------------------+
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)

    Comment


    • #3
      Carlo, thanks very much for your help, the code works perfectly.

      Ammar.

      Comment


      • #4
        Hi Carlo Lazzaro. Just as a follow up, why would I not do:

        drop if wanted>0 Rather than: drop if wanted>1 Lets say a firm has only 1 observation where "twonegcashflow" =1. By running "drop if wanted > 1" am I not keeping that firm in the dataset when I want to remove it? Thanks.

        Comment


        • #5
          Ammar:
          if I have understood you research aim well, you want to delete each panel in which the variable -twonegcashflow- takes on the value 1 in any of the year the panel is investigated.
          Hence, you may want to try;
          [CODE
          bysort DSCD (year): egen wanted=sum( twonegcashflow
          drop if wanted>1 & _N>1
          drop if wanted==1 & _N==1
          [/CODE]
          Kind regards,
          Carlo
          (Stata 19.0)

          Comment


          • #6
            Thanks Carlo!!

            Comment

            Working...
            X