Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Value-weighted average of returns

    Hello Statalist,
    I have daily panel data which consist of 10 companies for the period 2004-2014.
    The data as following.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float idc str21 company double _marketvalue float(bcaldate return_tr)
    1 "ALUAR"    4422  0             .
    1 "ALUAR"  4395.6  1    -1.0000079
    1 "ALUAR"  4461.6  2             .
    1 "ALUAR"  4474.8  3      .4938291
    1 "ALUAR"    4554  4      1.951283
    1 "ALUAR" 4593.59  5      .4819319
    1 "ALUAR" 4593.59  6  3.667978e-06
    1 "ALUAR" 4593.59  7  3.667978e-06
    1 "ALUAR" 4580.39  8  3.667978e-06
    1 "ALUAR"  4540.8  9             .
    1 "ALUAR"  4501.2 10             .
    1 "ALUAR" 4593.59 11             .
    1 "ALUAR"    4686 12     2.3754122
    1 "ALUAR"    4686 13 -5.372813e-06
    1 "ALUAR" 4593.59 14             .
    1 "ALUAR" 4593.59 15             .
    1 "ALUAR"  4540.8 16             .
    1 "ALUAR"  4501.2 17             .
    1 "ALUAR"    4554 18             .
    1 "ALUAR"    4620 19      1.438877
    end
    format %tbmybcal bcaldate

    Here, I would like to generate daily market returns which computed as the value-weighted average of the returns of all individual stocks on a given day.

    Before this, I simply use the command;

    code
    Code:
     egen marketreturn = mean(return_tr)
    But, It seems wrong as it did not take into account the Market value as a weight. ​How can I instruct Stata to use market value as a weight? The market value used is at the beginning of each year.

    Thank you in advance for your kind help.

    Regards,
    Rozita



  • #2
    In general, weighted means just require two preparatory steps:

    Code:
    egen num = total(weight * value * !missing(weight, value)), by(group)
    egen den = total(weight * !missing(weight, value)), by(group)
    gen wtmean = num/den
    My names here indicate the quaint terms numerator and denominator used by my teachers about 1960 and evidently still in use.

    If weight and value are never missing, or always missing in precisely the same observations, then you can simplify to

    Code:
    egen num = total(weight * value), by(group)
    egen den = total(weight), by(group)
    gen wtmean = num/den
    The multiplier !missing(weight, value) is 1 when both variables are not missing and 0 otherwise; the effect is to ignore observations in which either is missing (or both).

    A great merit of egen here is that the extension to groupwise means is easy by just adding an option by(group) (which is no longer documented, but works fine) or by using a prefix bysort group:

    If you just want a single weighted mean, it is usually pointless to populate an entire variable with its value, as it can be saved in a scalar:

    Code:
     
    summarize value [aw=weight], meanonly 
    scalar wtmean = r(mean)


    Last edited by Nick Cox; 09 Jun 2016, 03:04.

    Comment


    • #3
      Thank you so much Nick Cox for the solution. I little bit confuse with the term weight and value, does it refer to marketvalue and return_tr respectively as in my case?
      I did try the command given.

      code:
      Code:
       egen num = total(_marketvalue * return_tr * !missing(_marketvalue , return_tr)), by(company)
      Code:
       egen den = total(_marketvalue  * !missing(_marketvalue , return_tr)), by(company)
      Code:
      gen wtmean = num/den
      The following is the output:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str21 company float bcaldate double _marketvalue float(return_tr num den wtmean)
      "ALUAR"  0    4422             . 6192679 8346243 .741972
      "ALUAR"  1  4395.6    -1.0000079 6192679 8346243 .741972
      "ALUAR"  2  4461.6             . 6192679 8346243 .741972
      "ALUAR"  3  4474.8      .4938291 6192679 8346243 .741972
      "ALUAR"  4    4554      1.951283 6192679 8346243 .741972"ALUAR"  5 4593.59      .4819319 6192679 8346243 .741972
      "ALUAR"  6 4593.59  3.667978e-06 6192679 8346243 .741972
      "ALUAR"  7 4593.59  3.667978e-06 6192679 8346243 .741972
      "ALUAR"  8 4580.39  3.667978e-06 6192679 8346243 .741972
      "ALUAR"  9  4540.8             . 6192679 8346243 .741972
      "ALUAR" 10  4501.2             . 6192679 8346243 .741972
      "ALUAR" 11 4593.59             . 6192679 8346243 .741972
      "ALUAR" 12    4686     2.3754122 6192679 8346243 .741972
      "ALUAR" 13    4686 -5.372813e-06 6192679 8346243 .741972
      "ALUAR" 14 4593.59             . 6192679 8346243 .741972
      "ALUAR" 15 4593.59             . 6192679 8346243 .741972
      "ALUAR" 16  4540.8             . 6192679 8346243 .741972
      "ALUAR" 17  4501.2             . 6192679 8346243 .741972
      "ALUAR" 18    4554             . 6192679 8346243 .741972
      "ALUAR" 19    4620      1.438877 6192679 8346243 .741972
      end
      format %tbmybcal bcaldate
      Thank you again.

      Comment


      • #4
        I didn't address the specifics of your case. I should have said that I didn't understand them.

        For example, you want value at the beginning of each year, so that presumably means that your calculations are minimally by company and year.

        But you don't give a year variable in your example dataset and you don't so far as I can see explain the relationship of bcaldate to years, whether calendar years or defined otherwise.

        So your question seems to pivot on identifying years from your date variable. If you can explain how to do that, I can make more specific suggestions.

        Comment


        • #5
          Thanks Nick Cox for the idea and sorry for late response. I am really stuck with the problem how to relate the caldate to years. What I try do is using the day of year command:

          Code:
          gen dayofyear2 = doy(bcaldate)
          However, when I try to generate new variable _market value at the beginning of the year, the result is weird. The command use is:

          Code:
           bysort company dayofyear : g begofyearMV2=_marketvalue[_n]

          If you have any idea how to solve the problem, please feel free to comment.

          Thank you.

          Comment


          • #6
            You don't give a dataset example, but that command is no more than

            Code:
              
            g begofyearMV2 = _marketvalue
            What you may want to do is something more like

            Code:
                
            bysort company year (dayofyear) : g begofyearMV2= _marketvalue[1]
            which is quite different. If you don't have a year variable, you'll need to create one.

            Comment


            • #7
              Dear Nick Cox,
              I think I got what you have shown by using this command;


              Code:
              gen dayofyear = doy(caldate)
              Code:
              bysort company year (dayofyear) : g begofyearMV= _marketvalue[1]

              Following the command above, the example of dataset is as following:


              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float idc str21 company float caldate double _marketvalue float(return yearly dayofyear begofyearMV)
              1 "ALUAR" 16803    4422             . 2006  2 4422
              1 "ALUAR" 16804  4395.6    -1.0000079 2006  3 4422
              1 "ALUAR" 16805  4461.6             . 2006  4 4422
              1 "ALUAR" 16806  4474.8      .4938291 2006  5 4422
              1 "ALUAR" 16807    4554      1.951283 2006  6 4422
              1 "ALUAR" 16810 4593.59      .4819319 2006  9 4422
              1 "ALUAR" 16811 4593.59  3.667978e-06 2006 10 4422
              1 "ALUAR" 16812 4593.59  3.667978e-06 2006 11 4422
              1 "ALUAR" 16813 4580.39  3.667978e-06 2006 12 4422
              1 "ALUAR" 16814  4540.8             . 2006 13 4422
              1 "ALUAR" 16817  4501.2             . 2006 16 4422
              1 "ALUAR" 16818 4593.59             . 2006 17 4422
              1 "ALUAR" 16819    4686     2.3754122 2006 18 4422
              1 "ALUAR" 16820    4686 -5.372813e-06 2006 19 4422
              1 "ALUAR" 16821 4593.59             . 2006 20 4422
              1 "ALUAR" 16824 4593.59             . 2006 23 4422
              1 "ALUAR" 16825  4540.8             . 2006 24 4422
              1 "ALUAR" 16826  4501.2             . 2006 25 4422
              1 "ALUAR" 16827    4554             . 2006 26 4422
              1 "ALUAR" 16828    4620      1.438877 2006 27 4422
              end
              format %td caldate
              format %ty yearly
              The data for market value at the beginning of the year seem correct for each company and each year.

              With regard to the original post, I would like to generate daily market returns which computed using value-weighted average of returns of all individual company on a given day.
              Market value will be use as a weight for the return.

              Thank you.


              Comment

              Working...
              X