  • Arithmetic, Geometric and Harmonic totals, and Arithmetic, Geometric and Harmonic weighted means

    The Arithmetic_Mean = (Sum Xi)/n, and therefore n*Arithmetic_Mean= Sum Xi. This fact motivates me to define a Weighted_Total = n*Weighted_Arithmetic_Mean, where Weighted_Arithmetic_Mean=(Sum Wi*Xi)/Sum(Wi).

    The Geometric_Mean = (Prod Xi)^(1/n), and therefore Prod Xi = Geometric_Mean^n. This fact motivates me to define a Weighted_Product = Weighted_Geometric_Mean^n, where Weighted_Geometric_Mean = (Prod Xi^Wi)^(1/Sum Wi).
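
    A minimal sketch of these two definitions in Stata, for illustration only. It assumes the auto dataset shipped with Stata, with price standing in for Xi and weight for Wi (both arbitrary choices), and it takes the weighted product to be Weighted_Geometric_Mean^n, by analogy with Prod Xi = Geometric_Mean^n. The scalar names are made up for the example. The product is kept on the log scale because its raw value is extremely large for prices of this size.
    Code:
    sysuse auto, clear
    * Illustrative assumption: price plays the role of Xi, weight the role of Wi
    generate double wx = weight * price
    generate double wlnx = weight * ln(price)
    quietly summarize weight
    scalar sumW = r(sum)
    scalar nobs = r(N)
    * Weighted arithmetic mean: Sum(Wi*Xi) / Sum(Wi)
    quietly summarize wx
    scalar WAM = r(sum) / sumW
    * Weighted geometric mean: exp( Sum(Wi*ln Xi) / Sum(Wi) )
    quietly summarize wlnx
    scalar WGM = exp(r(sum) / sumW)
    * Totals as defined above: n * mean, and mean^n (the latter as a log)
    scalar Weighted_Total = nobs * WAM
    scalar ln_Weighted_Product = nobs * ln(WGM)
    display "WAM " WAM "  WGM " WGM
    display "Weighted_Total " Weighted_Total "  ln(Weighted_Product) " ln_Weighted_Product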

    My first question: do you see anything silly in what I am doing?

    And my second is how can I do the same for the Harmonic Mean? That is, how can we define some interesting Weighted Total based on the Harmonic Mean?

  • #2
    I've been trying to think my response through to a conclusion or two, but with 50 hours since your post, maybe it's time to just share a few incomplete thoughts.

    I believe it would be useful to think about the question of weighted totals keeping the meaning of the weights in mind. Stata nicely distinguishes three types of statistically meaningful weights - frequency weights, sampling weights, and analytic weights - and then tosses in a fourth - importance weights, which the documentation admits have no statistical meaning per se.

    With regard to frequency weights, I think you would not want to define a Weighted_Total as you do, because in fact frequency weights are shorthand for repeated observations recorded a single time, and the true "n" in the dataset is not the number of observations in the dataset, but rather the sum of the weights.
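
    A small sketch of that point, under a loudly stated assumption: rep78 in the auto data is not really a frequency weight, and it is used here only because it is a small positive integer. Summarizing with [fw=rep78] should report the same n and the same n*mean as writing the repeated observations out with expand and summarizing unweighted. The scalar names are made up for the example.
    Code:
    sysuse auto, clear
    * Illustrative assumption: pretend rep78 counts how often each row occurred
    drop if missing(rep78)
    quietly summarize price [fw=rep78]
    scalar N_fw = r(N)
    scalar total_fw = r(N) * r(mean)
    * Write the repeated observations out explicitly and summarize unweighted
    expand rep78
    quietly summarize price
    scalar N_expanded = r(N)
    scalar total_expanded = r(N) * r(mean)
    display "n with fweights " N_fw "  n after expand " N_expanded
    display "total with fweights " total_fw "  total after expand " total_expanded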

    With regard to analytic weights, note that no matter how the weights Wi are expressed, the calculations are made with the rescaled values wi, where sum(wi) = n, so that wi = n*Wi/sum(Wi). In that case, your Weighted_Total will be identical to sum(wi*Xi); the code below computes it as the mean of n*wi*Xi, which is the same quantity. For whatever that is worth, I haven't figured out how to think of it.
    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . summarize price
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
           price |         74    6165.257    2949.496       3291      15906
    
    . 
    . generate w1 = weight // 😀
    
    . generate w2 = weight/2000
    
    . quietly summarize weight
    
    . generate w3 = r(N) * weight / r(mean)
    
    . 
    . summarize price [aw=w1]
    
        Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
    -------------+-----------------------------------------------------------------
           price |      74      223440    6568.637   3225.219       3291      15906
    
    . summarize price [aw=w2]
    
        Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
    -------------+-----------------------------------------------------------------
           price |      74      111.72    6568.637   3225.219       3291      15906
    
    . summarize price [aw=w3]
    
        Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
    -------------+-----------------------------------------------------------------
           price |      74  5475.99998    6568.637   3225.219       3291      15906
    
    . scalar Weighted_Total = r(N) * r(mean)
    
    . generate pw = price * w3
    
    . quietly summarize pw
    
    . display "Weighted_Total " Weighted_Total " - Mean weighted values " r(mean)
    Weighted_Total 486079.13 - Mean weighted values 486079.13
    So that's as far as I've gotten thinking about the issues involving the particular meaning of the weight being applied.

    We read in Wikipedia that the harmonic mean is typically used for ratios. An example from finance would be the price-earnings ratio.

    Similarly, we read that the geometric mean is typically used for values meant to be cumulated by multiplying them together. An example would be growth rates - measures of proportional growth - ratios taken between values at successive points in time.

    So the idea of a weighted total corresponding to a geometric mean is comprehensible to me - it should in some sense represent the cumulative growth represented by a collection of growth rates.
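
    A minimal sketch of that reading for the unweighted case, using three made-up growth factors (hypothetical values, not data from anywhere): the geometric mean raised to the power n reproduces the product of the factors, which is the cumulative growth.
    Code:
    clear
    * Hypothetical growth factors: +5%, -2%, +10%
    input double growth
    1.05
    0.98
    1.10
    end
    generate double lngrowth = ln(growth)
    quietly summarize lngrowth
    * Geometric mean = exp(mean of logs); product = exp(sum of logs)
    scalar GM = exp(r(mean))
    scalar cumulative = exp(r(sum))
    scalar GM_to_n = GM^r(N)
    display "Geometric mean " GM "  GM^n " GM_to_n "  product of factors " cumulative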

    I'm not able to fathom what cumulative quantity is represented by the harmonic mean of a group of price-earnings ratios. That is, I can comprehend an average price-earnings ratio, but I cannot comprehend it as some sort of function of a pair of aggregated quantities, one of which would be in some sense a total.

    So 51 hours later (writing this has taken a bit) those are my thoughts about potentially useful approaches to thinking about the problems you've set yourself to - is the "Weighted Total" a sensible concept, and what would be a suitable "Weighted Total" for harmonic means.




    • #3
      At some risk of spelling out what everyone knows, the intimate connection between means and totals doesn't necessarily imply that totals are useful, or even sensibly defined, even when means are. So, in climatology we happily average temperatures but total temperatures don't have separate interest, yet some related quantities do, so that cooling degree days and heating degree days are useful sums or if you prefer integrals over daily records of | temperature - threshold temperature |. In economics average income makes sense, even if we might prefer another summary, and total income can make sense, e.g. over members of a family.

      Physical scientists talk about intensive and extensive properties https://en.wikipedia.org/wiki/Intens...ive_properties and my namesake Sir David Cox has often emphasised (or rather I infer from his characteristic level tone) that the distinction is more important than many distinctions between kinds of measurement often prominent in the statistical literature.

      To the point: I have not heard that totals corresponding to geometric or harmonic means were interesting or useful, but that is a zero rather than a negative, and I was rather waiting for someone to show they could be....



      • #4
        While "total" is casually used as a synonym for "addition" and "sum", in my dictionary (OED Compact Edition, 1971) that is but one meaning. In writing "weighted total" I was using total in the sense of "the aggregate; the whole sum or amount; a whole" where I note that the OED hints at a distinction between "sum" and "amount".

        I used "weighted total" in that sense, and to provide a consistent term across the different means. Indeed, Joro called the result derived from the weighted geometric mean a "weighted product" rather than "weighted total", which he too reserved for the addition from which an arithmetic mean is built.

        For the geometric mean, the product of the individual measures of relative growth yields a measure of the total relative growth represented by the entirety of the individual measures. So in that sense indeed, in the unweighted version at least, there is a "total" (though not a "sum") corresponding to the geometric mean.

        With that said, I'm not claiming any opinion - positive or negative - about the issues Joro is considering. My goal was to point out the considerations of weighting, and my inability to come up with a meaningful concept of total for the harmonic mean, in the hope that he would find these insights new and perhaps helpful.



        • #5
          William Lisowski and Nick, thank you very much for your thoughts on this. Of course I agree with Nick that defining totals in the way William and I are doing sometimes makes sense and sometimes does not.

          We leave the harmonic total aside as something for which we cannot think of a meaningful total.

          William raises an important point about the distinction between analytic and frequency weights. I will think more about this, but I think there is a simple solution: if analytic weights are intended, rescale to a total by multiplying the arithmetic mean by N, or by raising the geometric mean to the power N; if frequency weights are intended, rescale by multiplying the arithmetic mean by Sum Wi, or by raising the geometric mean to the power Sum Wi. (I will look again through what William showed for the weighting to make sure that I am not going in logical circles, but this discussion refers to a command that I am programming, so whatever formula I choose for the weights is what the command will use, regardless of how Stata customarily handles weights.)
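
          A minimal sketch of that rule, not the command itself. It assumes the auto dataset, with price as Xi and rep78 standing in for Wi purely for illustration; the scalar names are made up. The geometric totals are reported as logs, since mean^N and mean^(Sum Wi) are very large here.
          Code:
          sysuse auto, clear
          * Illustrative assumption: rep78 stands in for the weights Wi
          drop if missing(rep78)
          quietly summarize price [aw=rep78]
          scalar WAM = r(mean)
          scalar nobs = r(N)
          quietly summarize rep78
          scalar sumW = r(sum)
          * Log of the weighted geometric mean: Sum(Wi*ln Xi) / Sum(Wi)
          generate double wlnx = rep78 * ln(price)
          quietly summarize wlnx
          scalar lnWGM = r(sum) / sumW
          * Analytic reading scales by N, frequency reading scales by Sum Wi
          scalar total_aw = nobs * WAM
          scalar total_fw = sumW * WAM
          scalar lnprod_aw = nobs * lnWGM
          scalar lnprod_fw = sumW * lnWGM
          display "Analytic:  total " total_aw "  log product " lnprod_aw
          display "Frequency: total " total_fw "  log product " lnprod_fw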

          A last word on the Geometric Total: as William pointed out, cumulative growth rates are one example where this quantity makes sense. Another example is compounded stock returns on a portfolio. Say that we study a year of monthly data. Without weighting, the geometric average is the average stock return per month, and the product is the total annual return. It is interesting in this context to extend the concepts to a weighted geometric mean and weighted total because, for example, we might want to weight each return by the share the stock takes in our portfolio.
