Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Delete non-zero changes in data

    Hi everyone,

    I have one more dynamic screen that I need to apply to my data and I can't figure it out.

    I have a variable Name containing the name of the company. I have a variable mofd containing the month (in Stata language) and i have a variable returns, containing the returns for each company in each month.
    I need to 'delete observations of stocks that show non-zero price changes in less than 50% of the traded months in previous 12 months'. I am looking at monthly stock returns for different companies. I use the timeframe 01-01-2013 till 01-05-2022. So this means that I will need to check non-zero price changes from 2021m6 till 2022m5. In these 12 months I want to count the number of zero returns. If the number of zero returns is bigger or equal to 6 (so the opposite of 'non-zero price changes in less than 50%), I want to delete the observations that have non-zero price changes.

    So in my head the steps would be:
    1. check how many zero-return observations a company has between 2021m6 till 2022m5.
    2. if equal or more than 6...
    3. delete other observations between 2021m6 and 2022m5 that do contain another value than zero.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str66 Name float(mofd returns1)
    ""                                               .            .
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 637     .2102426
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 638   .022271715
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 639   .074074075
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 640  -.019607844
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 641    .57586205
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 642     .1536105
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 643     .4332322
    removed some data bc it doesn't add anything and easier to read my post
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 675   -.10530099
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 676   -.02048604
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 677   -.04203404
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 678   -.09417808
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 679    .07277883
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 680     .3257709
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 681   -.09004818
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 682    -.4036882
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 683   -.09859155
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 684  -.010190218
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 685    .03019904
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 686    .08794137
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 687   -.05848132
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 688   -.03804878
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 689    .11899932
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 690  -.035347432
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 691     -.096461
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 692  -.010051993
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 693    .04586835
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 694   -.07833947
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 695   -.05848166
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 696            0
    removed some data bc it doesn't add anything and easier to read my post
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 723   -.20013388
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 724     .2502092
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 725     .5729585
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 726   -.02468085
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 727   -.19502617
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 728    .05149052
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 729    .04175258
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 730   -.10489856
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 731   .019900497
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 732     .3452575
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 733    .28041902
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 734    .17967275
    "SSH COMMUNICATIONS SECURITY - TOT RETURN IND" 735   .027473994
    end
    format %tm mofd
    So this would be an example of my data. Except here there is no removing needed because there is only one 0 and the zero is also not at the end. Month 2021m6 = 737 en 2022m5 = 748.

    Considering I deleted all the observations at the end containing a zero, it is possible that I for instance have no observation of 2022m5 or 2022m5 and 2022m4. But then I could still have equal or more than 6 observations containing zero (between 2021m6-2022m5).

    I hope you can help me.

    Bibi

  • #2
    Your question appears to contradict itself. Initially, you state that you want to drop all observations of a firm if it has any 12 month period in which there are 6 or more months with zero return. But then you go on in a way that implies that in fact the only 12 month period you care about is from 2021m6 to 2022m5. I'm going to assume you mean the former, since the latter is really quite simple.

    Code:
    gen byte zero_return = (returns1 == 0)
    rangestat (sum) zero_return, by(Name) interval(mofd -11 0)
    by Name (mofd), sort: drop if zero_return_sum >= 6
    -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer. It is available from SSC. Note that I have taken "previous 12 months" to mean that the current month is included. So, for example, the "previous 12 months" from December 2022 would be January 2022 through December 2022. If you would prefer it to be December 2021 through November 2022, change -11 0 to -12 -1 in the -interval()- option.

    (If you are really interested in only 2021m6 through 2022m5, and need assistance with coding that, post back.)

    Comment


    • #3


      Hi Clyde,


      I'm sorry for the unclear post. I was trying to write a reply and then I noticed that you could actually also interprete 'I delete observations of stocks that show non-zero price changes in less than 50% of the traded months in previous 12 months' in the way you did in your post. My supervisor told me the previous 12 months were the last months so indeed 2021m6 and 2022m5, but your explanation makes more sense.


      I ran your code and it works perfectly. Thank you very much.

      Comment

      Working...
      X