Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Weighted Average

    I have the below daily dataset to work with. I need a weighted average of all rt's for each time period (i.e., by date) weighted by their immediately previous period (lagged) mcap value. Therefore, data of rt's for each stock id is from begin June month onward and data on mcap is from last day of May (one period earlier) onward. The below data example has data on five stocks and actual dataset has data on about 5000 stocks.

    If the same weighted average were to be calculated for monthly rt's then would the code require any modification?

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float stock_id str54 stock int date double mcap float rt
    1 "3M India Ltd."              15491  318.46            .
    1 "3M India Ltd."              15494  310.07  -.026353024
    1 "3M India Ltd."              15495     325    .04813806
    1 "3M India Ltd."              15496  318.35  -.020450607
    1 "3M India Ltd."              15497  320.94   .008138712
    1 "3M India Ltd."              15498  310.97   -.03106353
    1 "3M India Ltd."              15501       .            .
    1 "3M India Ltd."              15502  324.32            .
    1 "3M India Ltd."              15503  321.05  -.010072942
    1 "3M India Ltd."              15504  313.96  -.022105264
    1 "3M India Ltd."              15505   318.8   .015428777
    1 "3M India Ltd."              15508  324.32   .017314488
    1 "3M India Ltd."              15509  332.26    .02448767
    1 "3M India Ltd."              15510  331.76 -.0015256823
    1 "3M India Ltd."              15511  331.19  -.001697793
    1 "3M India Ltd."              15512  319.93  -.034013607
    1 "3M India Ltd."              15515  326.74    .02130282
    1 "3M India Ltd."              15516  325.67 -.0032752973
    1 "3M India Ltd."              15517  318.24   -.02282947
    1 "3M India Ltd."              15518  326.69    .02654867
    1 "3M India Ltd."              15519  321.22  -.016724138
    2 "3P Land Holdings Ltd."      15491       .            .
    2 "3P Land Holdings Ltd."      15494       .            .
    2 "3P Land Holdings Ltd."      15495       .            .
    2 "3P Land Holdings Ltd."      15496       .            .
    2 "3P Land Holdings Ltd."      15497       .            .
    2 "3P Land Holdings Ltd."      15498       .            .
    2 "3P Land Holdings Ltd."      15501       .            .
    2 "3P Land Holdings Ltd."      15502    4.28            .
    2 "3P Land Holdings Ltd."      15503       .            .
    2 "3P Land Holdings Ltd."      15504    4.27            .
    2 "3P Land Holdings Ltd."      15505    3.89    -.0886076
    2 "3P Land Holdings Ltd."      15508    4.27    .09722222
    2 "3P Land Holdings Ltd."      15509    4.68    .09704641
    2 "3P Land Holdings Ltd."      15510    5.15           .1
    2 "3P Land Holdings Ltd."      15511    5.04   -.02097902
    2 "3P Land Holdings Ltd."      15512    4.77   -.05357143
    2 "3P Land Holdings Ltd."      15515    4.95    .03773585
    2 "3P Land Holdings Ltd."      15516    5.31    .07272727
    2 "3P Land Holdings Ltd."      15517    5.83    .09830508
    2 "3P Land Holdings Ltd."      15518    6.12    .04938272
    2 "3P Land Holdings Ltd."      15519    5.98   -.02352941
    3 "63 Moons Technologies Ltd." 15491  167.94            .
    3 "63 Moons Technologies Ltd." 15494  170.37   .014459665
    3 "63 Moons Technologies Ltd." 15495  166.41  -.023255814
    3 "63 Moons Technologies Ltd." 15496  174.71     .0499232
    3 "63 Moons Technologies Ltd." 15497  162.32    -.0709583
    3 "63 Moons Technologies Ltd." 15498  157.97  -.026771653
    3 "63 Moons Technologies Ltd." 15501  159.38   .008899677
    3 "63 Moons Technologies Ltd." 15502  159.25 -.0008019246
    3 "63 Moons Technologies Ltd." 15503  158.74  -.003210273
    3 "63 Moons Technologies Ltd." 15504  158.48  -.001610306
    3 "63 Moons Technologies Ltd." 15505  153.88   -.02903226
    3 "63 Moons Technologies Ltd." 15508  169.22    .09966777
    3 "63 Moons Technologies Ltd." 15509  166.66   -.01510574
    3 "63 Moons Technologies Ltd." 15510  160.53  -.036809817
    3 "63 Moons Technologies Ltd." 15511  159.76   -.00477707
    3 "63 Moons Technologies Ltd." 15512  156.31       -.0216
    3 "63 Moons Technologies Ltd." 15515  159.12   .017988553
    3 "63 Moons Technologies Ltd." 15516  156.82   -.01445783
    3 "63 Moons Technologies Ltd." 15517  143.27   -.08638957
    3 "63 Moons Technologies Ltd." 15518  144.81   .010704728
    3 "63 Moons Technologies Ltd." 15519  143.91  -.006178288
    4 "A B B India Ltd."           15491 1042.38            .
    4 "A B B India Ltd."           15494 1040.47   -.00182964
    4 "A B B India Ltd."           15495 1042.38   .001832994
    4 "A B B India Ltd."           15496  1046.4   .003862574
    4 "A B B India Ltd."           15497 1036.23  -.009720535
    4 "A B B India Ltd."           15498 1034.11 -.0020449897
    4 "A B B India Ltd."           15501 1038.35  .0040983604
    4 "A B B India Ltd."           15502 1102.77    .06204082
    4 "A B B India Ltd."           15503 1102.77            0
    4 "A B B India Ltd."           15504 1072.47   -.02747886
    4 "A B B India Ltd."           15505 1112.52     .0373444
    4 "A B B India Ltd."           15508 1118.88   .005714286
    4 "A B B India Ltd."           15509 1117.18 -.0015151515
    4 "A B B India Ltd."           15510 1076.28    -.0366085
    4 "A B B India Ltd."           15511 1098.32    .02047647
    4 "A B B India Ltd."           15512 1089.84  -.007717538
    4 "A B B India Ltd."           15515  1087.3 -.0023332685
    4 "A B B India Ltd."           15516  1078.4  -.008185538
    4 "A B B India Ltd."           15517 1059.54  -.017488701
    4 "A B B India Ltd."           15518 1069.29        .0092
    4 "A B B India Ltd."           15519 1139.01    .06520016
    5 "A B C India Ltd."           15491    2.15            .
    5 "A B C India Ltd."           15494       .            .
    5 "A B C India Ltd."           15495    2.03            .
    5 "A B C India Ltd."           15496       2   -.01234568
    5 "A B C India Ltd."           15497    2.25         .125
    5 "A B C India Ltd."           15498     1.9   -.15555556
    5 "A B C India Ltd."           15501       .            .
    5 "A B C India Ltd."           15502    2.05            .
    5 "A B C India Ltd."           15503       .            .
    5 "A B C India Ltd."           15504     2.2            .
    5 "A B C India Ltd."           15505    2.03   -.07954545
    5 "A B C India Ltd."           15508    2.23    .09876543
    5 "A B C India Ltd."           15509    2.38    .06741573
    5 "A B C India Ltd."           15510    2.55    .07368421
    5 "A B C India Ltd."           15511    2.48  -.029411765
    5 "A B C India Ltd."           15512    2.48            0
    end
    format %td date

  • #2
    Code:
    //  MARK OUT FISCAL YEARS
    gen int fyear = year(date) + (month(date) >= 6)
    
    //  SET WEIGHT TO MCAP FROM LAST DAY OF PRECEDING FISCAL YEAR
    isid stock_id date, sort
    rangestat (last) weight = mcap, by(stock_id) interval(fyear -1 -1)
    
    //  CALCULATE WEIGHTED AVERAGES FOR EACH DATE
    by date, sort: egen numerator = total(weight*rt)
    by date: egen denominator = total(weight)
    gen weighted_rt = numerator/denominator
    Note: Stock id 2 has no mcap reported for the end of May 2002, and so does not contribute to the calculations of weighted means.

    If the variable date were a monthly date variable instead the changes would, I think, be as follows:

    Code:
    //  MARK OUT FISCAL YEARS
    gen int fyear = year(dofm(date)) + (month(dofm(date)) >= 6)
    The rest of the code would be unchanged.




    Comment


    • #3
      Note: Stock id 2 has no mcap reported for the end of May 2002, and so does not contribute to the calculations of weighted means.
      I feel #2 doesn't do the desired. May be i did not explain well. Actually the weighted return for each period should follow the below methodology:- Hence, as per this method, for each time period, i.e. day, there has to be one average return observation (weighted average) for all the stocks. Please note, in case of monthly calculation, I will use monthly data instead of daily.

      Attached Files
      Last edited by Sartaj Hussain; 25 Sep 2022, 01:00.

      Comment


      • #4
        Does this serve your purpose?
        Code:
        xtset stock_id date
        gen l_mcap = L.mcap
        egen numerator = total(rt*l_mcap), by(date) missing
        egen denominator = total(l_mcap), by(date) missing
        gen wt_return = numerator/denominator
        drop numerator denominator l_mcap
        The average return by date is:
        Code:
        duplicates drop date, force
        keep date wt_return
        format %5.4f wt_return
        
        list, sep(0) noobs
        
          +----------------------+
          |      date   wt_ret~n |
          |----------------------|
          | 31may2002          . |
          | 03jun2002          . |
          | 04jun2002     0.0085 |
          | 05jun2002     0.0037 |
          | 06jun2002    -0.0128 |
          | 07jun2002    -0.0110 |
          | 10jun2002          . |
          | 11jun2002     0.0537 |
          | 12jun2002    -0.0024 |
          | 13jun2002    -0.0238 |
          | 14jun2002     0.0256 |
          | 17jun2002          . |
          | 18jun2002     0.0026 |
          | 19jun2002    -0.0289 |
          | 20jun2002     0.0130 |
          | 21jun2002    -0.0147 |
          | 24jun2002          . |
          | 25jun2002    -0.0075 |
          | 26jun2002    -0.0251 |
          | 27jun2002     0.0131 |
          | 28jun2002     0.0409 |
          +----------------------+
        Last edited by Hemanshu Kumar; 25 Sep 2022, 04:37.

        Comment


        • #5
          No, #4 is not adequate. One basic issue it has is that It ignores first observation of mcap for lags, i.e. mcap of last day of May. And that is why there is no wt_return observation for first day of June month.

          I basically require rwt as average of all stock returns by (date), literally one observation per date. For the formula given in #3, i want to add below post script.
          Click image for larger version

Name:	formulat.PNG
Views:	1
Size:	9.7 KB
ID:	1683238

          Last edited by Sartaj Hussain; 25 Sep 2022, 05:02.

          Comment


          • #6
            #4 does produce the same weighted return across all stock observations on a date. You can then just keep one observation per date in your dataset, if you like. The code for that has been added to #4 in edit.

            Comment


            • #7
              One basic issue it has is that It ignores first observation of mcap for lags, i.e. mcap of last day of May. And that is why there is no wt_return observation for first day of June month.
              This issue is still not addressed. The actual data can begin from any day of any month and that is why one observation of May is kept in data example.
              Last edited by Sartaj Hussain; 25 Sep 2022, 05:01.

              Comment


              • #8
                No, it is not ignoring the first observation. You don't have any observation for June 1 in your data example. If you did, the weighted return for that date would incorporate the mcap for one day prior, i.e. the last day of May.

                Comment


                • #9
                  Okay, I think I understand the issue. I looked it up, and May 31, 2002 was a Friday, and the next date in your dataset is a Monday. So then you don't want to use the lag operator from my code in #4, you can do it using the following code:

                  Code:
                  bysort stock_id (date): gen l_mcap = mcap[_n-1]
                  egen numerator = total(rt*l_mcap), by(date) missing
                  egen denominator = total(l_mcap), by(date) missing
                  gen wt_return = numerator/denominator
                  drop numerator denominator l_mcap
                  
                  duplicates drop date, force
                  keep date wt_return
                  format %5.4f wt_return
                  
                  list, sep(0) noobs
                  which produces:
                  Code:
                    +----------------------+
                    |      date   wt_ret~n |
                    |----------------------|
                    | 31may2002          . |
                    | 03jun2002    -0.0051 |
                    | 04jun2002     0.0085 |
                    | 05jun2002     0.0037 |
                    | 06jun2002    -0.0128 |
                    | 07jun2002    -0.0110 |
                    | 10jun2002     0.0038 |
                    | 11jun2002     0.0537 |
                    | 12jun2002    -0.0024 |
                    | 13jun2002    -0.0238 |
                    | 14jun2002     0.0256 |
                    | 17jun2002     0.0175 |
                    | 18jun2002     0.0026 |
                    | 19jun2002    -0.0289 |
                    | 20jun2002     0.0130 |
                    | 21jun2002    -0.0147 |
                    | 24jun2002     0.0046 |
                    | 25jun2002    -0.0075 |
                    | 26jun2002    -0.0251 |
                    | 27jun2002     0.0131 |
                    | 28jun2002     0.0409 |
                    +----------------------+
                  This might suffice for your data, but it could potentially produce wrong results if you have observations missing for dates when the market was actually open. If this is the case, you will need to create a business calendar, and then use the lag operator. See:

                  Code:
                  help datetime business calendars

                  Comment


                  • #10
                    In this dataset, it will have observations for all dates when market was open, since each date represents a cross-section of rt's of stocks. But on a given date, there may be some cases where rt or mcap or both of these is missing. I just want the code to treat such observations as missing (not part of calculation of weighted rt on that date). Hope that is amply clear.

                    Comment


                    • #11
                      As long as the observation exists, even if those specific variables are missing, the code in #9 should work fine.

                      Comment


                      • #12
                        Sartaj Hussain You are asking the impossible. The problem is not the code, it's the data. In the example data for stock_id #2 there is no value of mcap given for 31 May 2002: there is only a missing value there. Nor could one substitute some other date from May 2002, because there are other observations at all in that month for that stock_id. In order to include stock_id #2 in the calculations, there has to be a value of mcap for 31 May 2002. There is no way to code around that. Where there is no data, there can be no result.

                        Comment


                        • #13
                          Clyde Schechter Please go through #5 which i posted today. The code has to do calculations for each day. In case for any stock, if the value of rt or mcap or both is missing, then it can ignore that stock for that day. But as a whole for each time period (day), other daily rt's using the previous period mcap as weight (provided either rt, mcap or both is not missing) have to aggregate into a single daily weighted average measure.
                          Last edited by Sartaj Hussain; 25 Sep 2022, 10:36.

                          Comment


                          • #14
                            Sartaj Hussain have you tried the code provided in #9? if so, can you explain why that doesn't solve your issue?

                            Comment


                            • #15
                              I tried it but as a word of clarification. These two lines in the code create values even when mcap or rt is missing

                              Code:
                               
                               egen numerator = total(rt*l_mcap), by(date) missing egen denominator = total(l_mcap), by(date) missing

                              Comment

                              Working...
                              X