hi guys,
I am working with the following data:
. dataex
----------------------- copy starting from the next line -----------------------
------------------ copy up to and including the previous line ------------------
Listed 100 out of 2370016 observations
Use the count() option to list more
- it is not super evident from this data example. However I calculate daily returns by
. The change in price.
- this is data from datastream. this source " pads" the variable price, after a company (Name) is dead. thus the last valid price is repeated till the end of your chosen time period.
- therefore I want do to the following: Delete all zero values for return from the end my sample untill the first non-zero return.
- From reading up on some different statalist posts I thought I would try the following:
--> this drops 19500 observations. whilst when I go through it manually, for some companies( Name) it seems not be applied optimally. The zero`s in between valid datapoint are kept however that is pleasing to see.
Where am I making a error in my thought process?
I am working with the following data:
. dataex
----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input str6 Code_Lefted float date str48 Name double prices float return "130298" 13485 "ALBEMARLE" 9.25 . "130298" 13487 "ALBEMARLE" 9.375 .013513514 "130298" 13489 "ALBEMARLE" 9.125 -.02666667 "130298" 13493 "ALBEMARLE" 9.5 .04109589 "130298" 13494 "ALBEMARLE" 9.5625 .006578947 "130298" 13495 "ALBEMARLE" 9.4375 -.013071896 "130298" 13501 "ALBEMARLE" 8.3125 -.1192053 "130298" 13502 "ALBEMARLE" 9.125 .09774436 "130298" 13503 "ALBEMARLE" 9.0625 -.006849315 "130298" 13506 "ALBEMARLE" 8.9375 -.013793103 "130298" 13508 "ALBEMARLE" 8.6875 -.02797203 "130298" 13510 "ALBEMARLE" 8.875 .021582734 "130298" 13514 "ALBEMARLE" 9.0625 .02112676 "130298" 13516 "ALBEMARLE" 9.5625 .05517241 "130298" 13520 "ALBEMARLE" 9.5625 0 "130298" 13523 "ALBEMARLE" 9.4375 -.013071896 "130298" 13524 "ALBEMARLE" 9.5625 .013245033 "130298" 13527 "ALBEMARLE" 9.5 -.006535948 "130298" 13531 "ALBEMARLE" 9.375 -.013157895 "130298" 13538 "ALBEMARLE" 9.5625 .02 "130298" 13541 "ALBEMARLE" 9.3125 -.02614379 "130298" 13578 "ALBEMARLE" 9.8125 .05369128 "130298" 13583 "ALBEMARLE" 9.75 -.006369427 "130298" 13584 "ALBEMARLE" 9.875 .012820513 "130298" 13586 "ALBEMARLE" 10 .012658228 "130298" 13592 "ALBEMARLE" 9.4375 -.05625 "130298" 13593 "ALBEMARLE" 9.25 -.01986755 "130298" 13597 "ALBEMARLE" 9.375 .013513514 "130298" 13604 "ALBEMARLE" 8.9375 -.04666667 "130298" 13606 "ALBEMARLE" 8.9375 0 "130298" 13608 "ALBEMARLE" 9 .006993007 "130298" 13611 "ALBEMARLE" 9.125 .01388889 "130298" 13613 "ALBEMARLE" 9.25 .01369863 "130298" 13614 "ALBEMARLE" 9.25 0 "130298" 13615 "ALBEMARLE" 9.25 0 "130298" 13618 "ALBEMARLE" 9.5 .02702703 "130298" 13620 "ALBEMARLE" 9.375 -.013157895 "130298" 13621 "ALBEMARLE" 9 -.04 "130298" 13622 "ALBEMARLE" 9 0 "130298" 13625 "ALBEMARLE" 8.75 -.02777778 "130298" 13628 "ALBEMARLE" 9 .02857143 "130298" 13634 "ALBEMARLE" 8.8125 -.020833334 "130298" 13635 "ALBEMARLE" 8.875 .007092198 "130298" 13636 "ALBEMARLE" 9.3125 .04929578 "130298" 13639 "ALBEMARLE" 9.3125 0 "130298" 13640 "ALBEMARLE" 9.25 -.00671141 "130298" 13643 "ALBEMARLE" 9.25 0 "130298" 13646 "ALBEMARLE" 9.3125 .006756757 "130298" 13647 "ALBEMARLE" 9.3125 0 "130298" 13648 "ALBEMARLE" 9.25 -.00671141 "130298" 13653 "ALBEMARLE" 9.1875 -.006756757 "130298" 13660 "ALBEMARLE" 9.25 .006802721 "130298" 13662 "ALBEMARLE" 9.25 0 "130298" 13664 "ALBEMARLE" 9.375 .013513514 "130298" 13669 "ALBEMARLE" 9.125 -.02666667 "130298" 13671 "ALBEMARLE" 9.3125 .020547945 "130298" 13674 "ALBEMARLE" 9.3125 0 "130298" 13675 "ALBEMARLE" 9.3125 0 "130298" 13676 "ALBEMARLE" 9.4375 .01342282 "130298" 13677 "ALBEMARLE" 9.8125 .0397351 "130298" 13678 "ALBEMARLE" 9.9375 .012738854 "130298" 13682 "ALBEMARLE" 10.1875 .02515723 "130298" 13683 "ALBEMARLE" 10.125 -.006134969 "130298" 13684 "ALBEMARLE" 9.875 -.02469136 "130298" 13688 "ALBEMARLE" 10.0625 .01898734 "130298" 13690 "ALBEMARLE" 9.625 -.04347826 "130298" 13691 "ALBEMARLE" 10 .03896104 "130298" 13695 "ALBEMARLE" 10.5312 .05312 "130298" 13696 "ALBEMARLE" 11 .04451535 "130298" 13697 "ALBEMARLE" 11.375 .03409091 "130298" 13698 "ALBEMARLE" 11.4375 .005494506 "130298" 13704 "ALBEMARLE" 10.9375 -.04371585 "130298" 13705 "ALBEMARLE" 11.0625 .01142857 "130298" 13706 "ALBEMARLE" 11.0625 0 "130298" 13711 "ALBEMARLE" 11.4687 .036718644 "130298" 13717 "ALBEMARLE" 11.1562 -.027248075 "130298" 13718 "ALBEMARLE" 11.3437 .016806798 "130298" 13720 "ALBEMARLE" 11.125 -.019279424 "130298" 13723 "ALBEMARLE" 11.2187 .008422472 "130298" 13724 "ALBEMARLE" 11.5 .025074206 "130298" 13726 "ALBEMARLE" 12 .04347826 "130298" 13727 "ALBEMARLE" 12.0312 .0026 "130298" 13730 "ALBEMARLE" 12.125 .007796396 "130298" 13731 "ALBEMARLE" 11.9062 -.01804536 "130298" 13732 "ALBEMARLE" 11.9687 .005249366 "130298" 13734 "ALBEMARLE" 11.875 -.007828753 "130298" 13737 "ALBEMARLE" 11.9062 .002627368 "130298" 13738 "ALBEMARLE" 11.7812 -.010498731 "130298" 13739 "ALBEMARLE" 11.9062 .010610125 "130298" 13741 "ALBEMARLE" 11.7187 -.015748098 "130298" 13746 "ALBEMARLE" 11.6875 -.0026624114 "130298" 13751 "ALBEMARLE" 11.5937 -.008025669 "130298" 13752 "ALBEMARLE" 11.6875 .008090601 "130298" 13753 "ALBEMARLE" 11.75 .005347594 "130298" 13755 "ALBEMARLE" 12.2812 .04520851 "130298" 13759 "ALBEMARLE" 12.3125 .002548611 "130298" 13760 "ALBEMARLE" 13 .05583756 "130298" 13761 "ALBEMARLE" 12.7187 -.02163846 "130298" 13765 "ALBEMARLE" 12.9062 .014742073 "130298" 13769 "ALBEMARLE" 13.5 .0460089 end format %tdDDNNCCYY date
Listed 100 out of 2370016 observations
Use the count() option to list more
- it is not super evident from this data example. However I calculate daily returns by
Code:
gen return = (price-price[_n-1]) / price[_n-1]
- this is data from datastream. this source " pads" the variable price, after a company (Name) is dead. thus the last valid price is repeated till the end of your chosen time period.
- therefore I want do to the following: Delete all zero values for return from the end my sample untill the first non-zero return.
- From reading up on some different statalist posts I thought I would try the following:
Code:
gen last=.
Code:
bys Name (date): replace last=cond(return>0 & return<. & last[_n-1]==.,_n,last[_n-1])
Code:
drop if last==.
Where am I making a error in my thought process?
Comment