  • Combining spatial datasets using coordinates

    Dear all,

    I have two datasets in which every observation has two variables, lat and long, giving its geographic coordinates. In dataset A, I want to create two dummy variables that are defined for every observation as follows:
    - they are both zero if there is no observation in dataset B that lies within 50 km of the observation in dataset A
    - dummy_1 is 1 if there is an observation in dataset B that lies within 50 km of the observation in dataset A and variable_1 for that observation in dataset B is "up"
    - dummy_2 is 1 if there is an observation in dataset B that lies within 50 km of the observation in dataset A and variable_1 for that observation in dataset B is "down"
    Variable_1 can take values other than "up" and "down", but I guess I could restrict the sample so that these are the only two possibilities.

    Can anyone help me out with that? I'm a Stata beginner, so please don't take too much knowledge for granted.

    Thank you a lot in advance.

    Best regards,
    Marco
    Last edited by Marco Gallo; 27 May 2017, 05:38.

  • #2
    This is something that's pretty easy to do using geonear (from SSC). To install it, type in Stata's Command window:
    Code:
    ssc install geonear
    Once installed, the steps are something like:
    Code:
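    * create a demo dataset B: 100 random points (lat 37 to 41, lon -109 to -102)
    clear
    set seed 123456
    set obs 100
    gen b_id = _n
    gen double b_lat = 37 + (41 - 37) * uniform()
    gen double b_lon = -109 + (109 - 102) * uniform()
    * odd-numbered observations are "up", even-numbered ones are "down"
    gen direction = cond(mod(_n,2),"up","down")
    save "statalist_B.dta", replace
    
    * create a demo dataset A the same way and leave it in memory
    clear
    set obs 100
    gen a_id = _n
    gen double a_lat = 37 + (41 - 37) * uniform()
    gen double a_lon = -109 + (109 - 102) * uniform()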
    
    * find nearest neighbor of each A obs using locations in B
    geonear a_id a_lat a_lon using "statalist_B.dta", n(b_id b_lat b_lon)
    
    * merge other variables from B based on the nearest neighbor id
    rename nid b_id
    merge m:1 b_id using "statalist_B.dta", keep(master match) nogen 
    
    * create the dummies
    gen dummy_1 = km_to_nid < 50 & direction == "up"
    gen dummy_2 = km_to_nid < 50 & direction == "down"


    • #3
      Thanks a lot Robert! I don't yet understand all of it, but I'll work my way through it. However, with this the dummies will only react to the nearest neighbour, right? So if the nearest neighbour is "up" and there is another B observation within 50km that is "down", dummy_2 will be 0, right?


      • #4
        Yes, that's correct, only the nearest neighbor is considered. If you want to determine each dummy separately, you need to split the neighbor dataset into "up" and "down" datasets and call geonear separately.
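
        A minimal sketch of that split-and-repeat approach, assuming geonear is installed and "statalist_B.dta" from #2 is in the working directory. The file names "statalist_B_up.dta" and "statalist_B_down.dta", the new seed, and the variable names b_id_up, b_id_down, km_to_up and km_to_down are made up for illustration. It uses only the geonear syntax and output variables (nid, km_to_nid) already shown in #2, and runiform() in place of uniform().

        ```stata
        * assumes geonear is installed and statalist_B.dta from #2 exists

        * split B into one neighbor file per direction
        foreach d in up down {
            use "statalist_B.dta", clear
            keep if direction == "`d'"
            save "statalist_B_`d'.dta", replace
        }

        * demo A locations, generated the same way as in #2 (seed is arbitrary)
        clear
        set seed 654321
        set obs 100
        gen a_id = _n
        gen double a_lat = 37 + (41 - 37) * runiform()
        gen double a_lon = -109 + (109 - 102) * runiform()

        * nearest "up" and nearest "down" neighbor, one geonear call each
        foreach d in up down {
            geonear a_id a_lat a_lon using "statalist_B_`d'.dta", n(b_id b_lat b_lon)
            * rename so the next call can create nid and km_to_nid again
            rename nid b_id_`d'
            rename km_to_nid km_to_`d'
        }

        * each dummy now depends only on the nearest neighbor of its own type
        gen dummy_1 = km_to_up < 50
        gen dummy_2 = km_to_down < 50
        ```

        Because the nearest "up" point is within 50 km exactly when at least one "up" point is, checking only the nearest neighbor of each type is enough here, and both dummies can be 1 for the same A observation.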


        • #5
          Okay, perfect, thank you!
