Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Weights in pctile

    I was looking to understand how Stata computes weighted percentiles. I tried to follow the formula in the documentation (here: https://www.stata.com/manuals13/dpctile.pdf) but I get different results:

    Code:
    . set seed 1729
    
    . clear
    
    . set obs 100000
    number of observations (_N) was 0, now 100,000
    
    . gen w = 100 * runiform()
    
    . gen long x = _n
    
    . sort x w
    
    . gen wsum = sum(w)
    
    . gen P = wsum[_N] * 1 / 100
    
    . assert abs(wsum - P) > 0.1
    
    .
    . gen gt = (wsum > P)
    
    . qui sum x if gt
    
    . disp `r(min)'
    997
    
    .
    . _pctile x [aw = w], p(1)
    
    . return list
    
    scalars:
                     r(r1) =  996.5
    I'm not entirely sure why it gives that answer. It seems incorrect:

    Code:
    . format %21.5f wsum P
    
    . disp P
    49850.559
    
    . qui count if wsum < P
    
    . list in `=r(N) - 1' / `=r(N) + 1'
    
         +-------------------------------------------------+
         |        w     x          wsum             P   gt |
         |-------------------------------------------------|
    995. |  22.8744   995   49817.42578   49850.55859    0 |
    996. | 32.89221   996   49850.32031   49850.55859    0 |
    997. | 25.59445   997   49875.91406   49850.55859    1 |
         +-------------------------------------------------+
    This shows that the first x such that the cumulative weighted sum is is larger than P is 997; however, Stata seems to think that this requires averaging with the prior observation, which should only happen if 49850.32031 is equal to 49850.55859.

    I thought this might be a numerical precision issue, but I get the same result using doubles (the cum sum is 49850.31956 and P is 49850.56176). It is interesting to note that collapse gives what I understand to be the right answer:

    Code:
    . collapse (p1) x [aw = w]
    
    . disp x
    997
    Any insights?

Working...
X