Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate returns based on average purchase price

    Hello,

    I have a dataset of individual trading data representing investors buying and selling behavior. My dataset consists of investor identifier, stock identifier (cusip), date of purchase or sale of stocks, amount of stocks purchased or sold (negative sign indicates sale of stock) and the price of the stock. Now I want to calculate the return an investor generates with each stock. I also want to calculate unrealized returns, therefore each time a stock is sold other stocks of the investor that are currently in his portfolio should be evaluated on the date of sale using closing prices from CRSP. These "unrealized stocks" are the stocks where the variable "stock" == . . An unrealized return is a return the investor would have generated if he had sold the stock at that particular date.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(investor cusip) str9 dates float(stock price date)
    1 1234 "19910108"  100  10 11330
    1 4321 "19910203"   50  25 11356
    1 1234 "19910505" -100  12 11447
    1 4321 "19910505"    .  30 11447
    2 5678 "19920407"  200  75 11785
    2 5678 "19920409"   50  72 11787
    2 5678 "19920802" -250  50 11902
    3 1234 "19921105"  100 100 11997
    3 1234 "19921111"  100  85 12003
    3 1234 "19921212" -150 110 12034
    3 1234 "19921220"  -50 150 12042
    4 8910 "19930218"  100 500 12102
    4 7891 "19930509"   25  10 12182
    4 7891 "19930606"   50   8 12210
    4 8910 "19931006"  -75 510 12332
    4 7891 "19931006"    .  15 12332
    4 7891 "19931020"   30  11 12346
    4 8910 "19931020"    . 525 12346
    4 8910 "19931130"  -25 530 12387
    4 7891 "19931130"    .  16 12387
    5 4657 "19940506"   40 260 12544
    5 8585 "19940512"   50  25 12550
    5 4657 "19940513"   20 250 12551
    5 8585 "19940601"  -20  15 12570
    5 4657 "19940601"    . 251 12570
    5 8585 "19940605"   30  20 12574
    5 4657 "19940707"  -50 300 12606
    5 8585 "19940707"    .  26 12606
    5 8585 "19940725"   50  30 12624
    5 4567 "19940725"    . 330 12624
    6 1234 "19950506"  500 100 12909
    6 1234 "19950606"  250 110 12940
    6 1234 "19950607" -750 150 12941
    6 1234 "19950707"  100 110 12971
    6 1234 "19950709" -100 115 12973
    7 4567 "19950111"    5  10 12794
    7 4567 "19950115"   10   8 12798
    7 4567 "19950120"   -7   9 12803
    7 4567 "19950201"    8  15 12815
    7 4567 "19950209"  -16  16 12823
    end
    format %td date
    I want to calculate returns. Therefore I need the buy price and the sell price and then I calculate (sell price/buy price)-1. But the buy price should be the weighted average purchase price in case of multiple purchases before the sale of the stock. For example if we look at investor 2: He bought 200x of stock "5678" and then on another day 50x stocks of "5678" and then he sells his whole position of 250 stocks. So to calculate the return I want to calculate the average purchase price of the two previous purchases. The same calculation should apply for unrealized returns.
    There are also special cases, e.g. investor 6: He made two purchases before selling his whole position (750 stocks), then he buys from the same stock 100 pieces and sells the 100 stocks again. In the case of the sale of the 100 stocks I want that the average purchase price only considers the 100 stocks purchased before and ignores the 750 stocks purchased in the past, because the 750 stocks were already completely sold and therefore the relevant purchase price is 110$.
    A more special case is with investor 7: This case is quite similar to the one of investor 6 but now he did not sell his complete portfolio (complete portfolio consists of 15 stocks) but only sells 7, calculating the average purchase price here is straightforward, it is (5/15)*10 + (10/15)*8. But then the investor purchases additional 8 stocks, so that his current holding consists of 16 stocks (8 stocks + 8 stocks) and then he sells his whole position. To now determine the average purchase, I want the the remaining 8 stocks from his previous purchase have the average purchase price previously calculated and the newly purchased 8 stocks have a price of 15$. And with this information the new average purchase price should be calculated namely: (8/16)*8.7 + (8/16)*15.

    In think I need a new variable which keeps track of the current holding in each stock and a variable showing the average purchase price.

    I already have a code generating the current holding.
    Code:
    bysort investor cusip date: gen holding = sum(stock[_n])
    Additionally, I do not want to collapse my dataset instead the structure of the dataset should not change. I would like to have a return variable which only shows values in the case of a sale.
    But I do not know how to actually achieve my goals. Therefore, I would be grateful if someone could help me.

    Thanks in advance!
    Last edited by Jana He; 27 Nov 2022, 04:20.
Working...
X