  • Identifying closest observations

    Hello,

    My query is probably very simple but I can't seem to find a way to identify the closest observations.

    Let's say my data are as follows:

    v1
    20.59417
    20.63302
    20.73268
    20.73901
    20.74116
    20.80359
    20.92906
    20.98822
    20.99093
    21.20109
    21.24915
    21.31334
    21.32321
    21.3418
    etc.

    For each observation I would like to find the four closest observations (without including the actual observation) and create a mean from them.

    Any help would be appreciated.

    Kind Regards,

    Sam

  • #2
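    In your example the values are ordered, but let's not assume that. Here is one way to proceed using rangestat (SSC): sort on the variable, number the observations, and take the mean over the two neighbours on either side of each observation in sorted order, excluding the observation itself. Observations at the ends of the data have fewer than four neighbours in that window, as the count variable shows.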

    Code:
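    * rangestat is from SSC; install once with: ssc install rangestat
    sysuse auto, clear

    * sort by the variable of interest and number the observations
    sort mpg
    gen id = _n

    * count and mean of mpg over observations id-2 to id+2, excluding the current one
    rangestat (count) mpg (mean) mpg, interval(id -2 2) excludeself

    * check by hand for observation 3
    di (mpg[1] + mpg[2] + mpg[4] + mpg[5])/4

    13

    list *mpg* in 1/5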
    
         +----------------------------+
         | mpg   mpg_co~t    mpg_mean |
         |----------------------------|
      1. |  12          2          13 |
      2. |  12          3   13.333333 |
      3. |  14          4          13 |
      4. |  14          4        13.5 |
      5. |  14          4          14 |
         +----------------------------+
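
    A note on the window used here: interval(id -2 2) takes the two neighbours on each side in sorted order. That is not always the same set as the four values closest in distance, and it gives fewer than four neighbours at the ends (counts of 2 and 3 in the listing). The following is a minimal sketch in Python rather than Stata, written only to illustrate the difference. The function names window_mean and nearest_mean are hypothetical, the data are the v1 values from the question, and ties in distance are broken arbitrarily.

```python
def window_mean(values, half_width=2):
    """Mean of up to half_width neighbours on each side in sorted order, excluding self."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [None] * len(values)
    for rank, i in enumerate(order):
        lo = max(0, rank - half_width)
        hi = min(len(order), rank + half_width + 1)
        neighbours = [values[order[r]] for r in range(lo, hi) if r != rank]
        result[i] = sum(neighbours) / len(neighbours)
    return result


def nearest_mean(values, k=4):
    """Mean of the k values closest in absolute distance, excluding self (brute force)."""
    result = []
    for i, v in enumerate(values):
        others = [w for j, w in enumerate(values) if j != i]
        others.sort(key=lambda w: abs(w - v))  # closest first; ties keep input order
        closest = others[:k]
        result.append(sum(closest) / len(closest))
    return result


# The v1 values posted in the question.
v1 = [20.59417, 20.63302, 20.73268, 20.73901, 20.74116, 20.80359, 20.92906,
      20.98822, 20.99093, 21.20109, 21.24915, 21.31334, 21.32321, 21.3418]

# Print both versions side by side so the differences are visible.
for value, w, n in zip(v1, window_mean(v1), nearest_mean(v1)):
    print(f"{value:10.5f}  window: {w:10.5f}  nearest: {n:10.5f}")
```

    For the first value, 20.59417, the window holds only two neighbours, whereas the distance-based version uses all of 20.63302, 20.73268, 20.73901 and 20.74116. For 20.80359 the two versions pick different sets even away from the ends, because three of its four closest values lie below it. Which definition is wanted depends on the question being asked; the brute-force version is quadratic in the number of observations and is meant only as a check.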


    • #3
      Thanks Nick. That works perfectly!
