Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculating Weighted Means for parts of a Variable

    Dear Members,

    I struggle to find a sufficient way to calculate the weighted means for only parts of the same variable. My data looks like this:

    Code:
    clear
    input float(Date S rankR pp1)
    444   956.07 5          .
    445   341.13 1 -1.7071257
    445 19155.85 1 -1.5323987
    445  1114.85 1 -1.1427858
    445      437 1 -1.1130054
    445  1803.29 1 -1.0994228
    445  1616.67 2          .
    445   697.13 2          .
    445   646.48 2          .
    445  1777.67 2          .
    445   520.75 2          .
    445 10831.29 2          .
    445  1388.53 2          .
    445  2363.95 2          .
    445   567.16 2          .
    445   432.39 2          .
    445   348.06 2          .
    445  2076.89 2          .
    445  1620.49 3          .
    445  3048.57 3          .
    445   628.47 3          .
    445  2256.79 3          .
    445   484.28 3          .
    445   329.66 3          .
    445   741.08 3          .
    445    920.9 3          .
    445    492.1 3          .
    445   943.97 3          .
    445  2324.54 3          .
    445  1894.79 3          .
    445   475.74 3          .
    445   759.88 4          .
    445  1865.04 4          .
    445   700.96 4          .
    445   182.07 4          .
    445    699.3 4          .
    445  2344.02 4          .
    445  1140.33 4          .
    445  2937.88 4          .
    445   502.82 4          .
    445  1671.26 4          .
    445   647.73 4          .
    445   769.02 4          .
    445   286.54 4          .
    445   551.57 5          .
    445  4136.22 5          .
    445   148.34 5          .
    445  1296.76 5          .
    445  1110.87 5          .
    446 21517.73 1 -1.6654475
    446   356.26 1 -1.6385354
    446  1867.13 1 -1.0254856
    446  1189.24 1  -.9913489
    446    435.8 1  -.9444581
    446   652.46 2          .
    446  1669.61 2          .
    446   786.98 2          .
    446  1766.99 2          .
    446 11121.96 2          .
    446   542.17 2          .
    446   429.53 2          .
    446  2103.92 2          .
    446  2355.87 2          .
    446  3160.89 2          .
    446   346.89 2          .
    446  1712.37 2          .
    446  1422.09 3          .
    446    548.9 3          .
    446   734.96 3          .
    446   490.68 3          .
    446   626.79 3          .
    446   302.95 3          .
    446  2313.82 3          .
    446   505.58 3          .
    446   1043.2 3          .
    446   916.09 3          .
    446  1946.24 3          .
    446  2189.42 3          .
    446   762.12 3          .
    446    466.2 4          .
    446   176.39 4          .
    446   670.61 4          .
    446  1955.53 4          .
    446   702.44 4          .
    446  2824.92 4          .
    446  1752.46 4          .
    446   496.51 4          .
    446  1219.13 4          .
    446  2321.74 4          .
    446   641.98 4          .
    446   304.27 4          .
    446   775.58 4          .
    446   533.61 5          .
    446   162.69 5          .
    446  1198.62 5          .
    446  4058.29 5          .
    446   1012.3 5          .
    447   345.76 1 -1.7221066
    447 18398.61 1  -1.653302
    447   665.19 1 -1.1366626
    end
    format %tm Date


    This is only a short part of my whole date (41000 observations), however I hope you get a sense of how my data looks like.
    What I want to achieve is, that I am able to calculate the weighted mean for each of these 5er Blocks in pp1 (The last one is only 3 because of the limitation of 100 observations). Each of the pp1 obervations should be matched with the corresponding S-Variable (size) and then the average over the 5 should be calculated.
    I tried achieving this via the _gwtmean package, however, here I can only calculate the weighted average mean over all observation, and I am not able to split the observations in these little parts.

    Maybe as a little backround information, these are monthly industry returns, and these 5er blocks are bundled portfolios out of these industries. I want to calculate the mean return for those bundled portfolios and for that i first need the weighted average return for each month so afterwards i can use a simple mean calculation on these weighted means.

    I hope I explained the data as precise as neccessary.

    Kind regards,

    Peter

  • #2
    I'm not entirely sure what you're after. Are you looking for the weighted monthly mean by rankR? If so, you could try
    Code:
    bysort Date rankR: egen totalSize = total(S)
    bysort Date rankR: egen totalReturn = total(S*pp1)
    gen meanReturn = totalReturn/totalSize
    To then calculate the mean monthly return, you could use
    Code:
    bysort Date rankR: replace meanReturn = . if _n > 1
    bysort Date: egen meanMonthlyReturn = mean(meanReturn)
    bysort Date: replace meanMonthlyReturn = . if _n > 1
    If you need a different set of returns, it's just a matter of changing the variables after the -bysort-, which calculates things per unique group defined by the variables that follow it.

    Comment


    • #3
      Thank you Jessi for your answere. However, I think that is not quite what I am looking for, or I just misunderstood what you are doing.
      Let me try to explain my problem a bit more clearly:
      You see at the top of the Data these 5 observations for pp1: -1.707 down to -1.099. What I would like to calculate is the weighted average of these five observations (with S as the weight). After that, I want to perform the same calculation for the next block of five observations in pp1 and so on. However, I do not want to calculate the weighted average over all observations and then split that result into the corresponding months ie. receive a constant mean average for every month. My goal is to receive a individual weighted average for every observation and than to calculate the mean over these individual (5er block) weighted average observations.

      Comment


      • #4
        So the last thing you want is just a single number? In that case, I think you just need to remove the -bysort Date:- bit from the last two lines (5 and 6) and you'd get it, no? Does the first block (lines 1-3) at least perform your intended first step?

        Comment


        • #5
          Indeed they do, and now it works! Thanks very much. I under/overestimated the "power" of bysort

          Comment

          Working...
          X