  • Geometric means for longitudinal data

    I have a longitudinal data set of pregnant women, with one measurement per trimester nested within each woman. We measured biomarkers of toxicant exposure, for example toluene (bzma_im below). I examined the means using xtsum without any issues. A colleague recommended that I examine geometric means instead. I see there is a command, ameans, for cross-sectional data. Is there an equivalent for longitudinal data? I have not been able to find one. Thank you.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long pid float(trimester bzma_im)
    7511014 1       .
    7511014 2 23.1728
    7511014 3 22.2623
    7511015 1       .
    7511015 2 24.3935
    7511015 3 35.6095
    7511016 1  3.4852
    7511017 1       .
    7511017 2  4.6753
    7511017 3  3.2963
    7511018 1       .
    7511018 2       .
    7511018 3   12.88
    7511019 1       .
    end

  • #2
    The geometric mean of 1, 2, 3 and 4 is \((1\times2\times3\times4)^{\frac{1}{4}}= 24^{\frac{1}{4}}\approx 2.213\). As we cannot easily multiply over observations but can sum over them, a trick is to take the log of each value, calculate the mean of the logs, and then exponentiate that mean.

    $$e^{\frac{1}{4}(\log(1) + \log(2) + \log(3) + \log(4))} \approx 2.213$$
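    The worked example generalizes in the obvious way. As a sketch, writing \(x_1, \dots, x_n\) for the \(n\) nonmissing, strictly positive measurements of one woman (this notation is an assumption and does not appear in the thread):

    $$\left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}} = e^{\frac{1}{n}\sum_{i=1}^{n}\log(x_i)}$$

    The identity relies on every \(x_i\) being positive, since the log of zero or of a negative number is undefined.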

    Therefore:

    Code:
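    * log of each measurement; missing values stay missing
    gen logbzma_im = log(bzma_im)
    * per-woman sum of the logs and count of nonmissing logs
    bys pid: egen sum = total(logbzma_im)
    bys pid: egen count = total(!missing(logbzma_im))
    * geometric mean; missing (0/0) for a woman with no measurements
    gen wanted = exp(sum/count)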
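    The sum-and-count steps above can probably be collapsed into one, because egen's mean() function ignores missing values. The following is a minimal sketch under that assumption, using three of the women from the posted example; the variable names meanlog and gmean_alt are hypothetical and not from the thread.

    ```stata
    clear
    input long pid float(trimester bzma_im)
    7511014 1       .
    7511014 2 23.1728
    7511014 3 22.2623
    7511016 1  3.4852
    7511019 1       .
    end

    * per-woman mean of the logs; mean() skips missing values
    egen double meanlog = mean(log(bzma_im)), by(pid)

    * exponentiate to get the geometric mean;
    * this stays missing for a woman with no measurements
    gen double gmean_alt = exp(meanlog)

    list, sepby(pid)
    ```

    The result should agree with the sum-and-count version, including a missing value for any woman whose measurements are all missing.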
    Last edited by Andrew Musau; 18 Dec 2023, 13:30.


    • #3
      Andrew Musau's method is what I would use too: witness the gmean() function for egen, which dates from 1999 and is part of egenmore from SSC. You need to install that package (ssc install egenmore), or at least copy _ggmean.ado, before you can use it.

      The route from first principles is generally preferable. It's the same recipe: exp(ave(log())). Here is a demonstration:


      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long pid float(trimester bzma_im)
      7511014 1       .
      7511014 2 23.1728
      7511014 3 22.2623
      7511015 1       .
      7511015 2 24.3935
      7511015 3 35.6095
      7511016 1  3.4852
      7511017 1       .
      7511017 2  4.6753
      7511017 3  3.2963
      7511018 1       .
      7511018 2       .
      7511018 3   12.88
      7511019 1       .
      end
      
      egen gmean1 = gmean(bzma_im), by(pid)
      
      egen numer = total(log(bzma_im)), by(pid)
      
      egen denom = total(!missing(log(bzma_im))), by(pid)
      
      gen gmean2 = exp(numer/denom)
      
      list, sepby(pid)
      
      
           +-----------------------------------------------------------------------+
           |     pid   trimes~r   bzma_im     gmean1      numer   denom     gmean2 |
           |-----------------------------------------------------------------------|
        1. | 7511014          1         .   22.71299   6.245874       2   22.71299 |
        2. | 7511014          2   23.1728   22.71299   6.245874       2   22.71299 |
        3. | 7511014          3   22.2623   22.71299   6.245874       2   22.71299 |
           |-----------------------------------------------------------------------|
        4. | 7511015          1         .    29.4727   6.766929       2    29.4727 |
        5. | 7511015          2   24.3935    29.4727   6.766929       2    29.4727 |
        6. | 7511015          3   35.6095    29.4727   6.766929       2    29.4727 |
           |-----------------------------------------------------------------------|
        7. | 7511016          1    3.4852     3.4852   1.248525       1     3.4852 |
           |-----------------------------------------------------------------------|
        8. | 7511017          1         .   3.925709   2.735094       2   3.925709 |
        9. | 7511017          2    4.6753   3.925709   2.735094       2   3.925709 |
       10. | 7511017          3    3.2963   3.925709   2.735094       2   3.925709 |
           |-----------------------------------------------------------------------|
       11. | 7511018          1         .      12.88   2.555676       1      12.88 |
       12. | 7511018          2         .      12.88   2.555676       1      12.88 |
       13. | 7511018          3     12.88      12.88   2.555676       1      12.88 |
           |-----------------------------------------------------------------------|
       14. | 7511019          1         .          .          0       0          . |
           +-----------------------------------------------------------------------+
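      The same exp(mean(log())) recipe should work for other groupings too. As a sketch only, assuming a summary by trimester (rather than by woman) is also of interest, collapse can produce one row per trimester; the names logb, meanlog, n and gmean_tri are hypothetical and not from the thread.

      ```stata
      clear
      input long pid float(trimester bzma_im)
      7511014 1       .
      7511014 2 23.1728
      7511014 3 22.2623
      7511015 1       .
      7511015 2 24.3935
      7511015 3 35.6095
      7511016 1  3.4852
      7511017 1       .
      7511017 2  4.6753
      7511017 3  3.2963
      7511018 1       .
      7511018 2       .
      7511018 3   12.88
      7511019 1       .
      end

      * log of each measurement; missing values stay missing
      gen double logb = log(bzma_im)

      * one row per trimester: mean of the logs and count of nonmissing values
      collapse (mean) meanlog = logb (count) n = logb, by(trimester)

      * exponentiate the mean log to get the geometric mean
      gen double gmean_tri = exp(meanlog)

      list
      ```

      Note that collapse replaces the data in memory, so it may be worth wrapping this in preserve and restore if the original observations are still needed.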



      • #4
        Thank you very much Andrew and Nick!
        Last edited by Erin Mead-Morse; 19 Dec 2023, 07:21.
