Combining spatial datasets using coordinates

Marco Gallo

Join Date: May 2017

Posts: 25
#1

27 May 2017, 05:36

Dear all,

I have two datasets in which every observation has two variables, lat and long, giving its geographic coordinates. In dataset A, I want to create two dummy variables that are defined for every observation as follows:
- both are zero if no observation in dataset B lies within 50 km of the observation in dataset A
- dummy_1 is 1 if there is an observation in dataset B that lies within 50 km of the observation in dataset A and variable_1 for that observation in dataset B is "up"
- dummy_2 is 1 if there is an observation in dataset B that lies within 50 km of the observation in dataset A and variable_1 for that observation in dataset B is "down"
variable_1 can take values other than "up" and "down", but I guess I could restrict the sample so that these are the only two possibilities.

Can anyone help me out with that? I'm a Stata beginner, so please don't take too much knowledge for granted.

Thank you a lot in advance.

Best regards,
Marco

Last edited by Marco Gallo; 27 May 2017, 05:38.
Tags: data

Robert Picard

Join Date: Mar 2014
Posts: 1536

27 May 2017, 09:15

This is something that's pretty easy to do using geonear (from SSC). To install it, type in Stata's Command window:

Code:

ssc install geonear

Once installed, the steps are something like:

Code:

clear
set seed 123456
set obs 100
gen b_id = _n
gen double b_lat = 37 + (41 - 37) * uniform()
gen double b_lon = -109 + (109 - 102) * uniform()
gen direction = cond(mod(_n,2),"up","down")
save "statalist_B.dta", replace

clear
set obs 100
gen a_id = _n
gen double a_lat = 37 + (41 - 37) * uniform()
gen double a_lon = -109 + (109 - 102) * uniform()

* find nearest neighbor of each A obs using locations in B
geonear a_id a_lat a_lon using "statalist_B.dta", n(b_id b_lat b_lon)

* merge other variables from B based on the nearest neighbor id
rename nid b_id
merge m:1 b_id using "statalist_B.dta", keep(master match) nogen 

* create the dummies
gen dummy_1 = km_to_nid < 50 & direction == "up"
gen dummy_2 = km_to_nid < 50 & direction == "down"


Marco Gallo

Join Date: May 2017

Posts: 25
#3

27 May 2017, 14:22

Thanks a lot, Robert! I don't yet understand all of it, but I'll work my way through it. However, with this approach the dummies will only react to the nearest neighbour, right? So if the nearest neighbour is "up" and there is another B observation within 50 km that is "down", dummy_2 will be 0, right?
Robert Picard

Join Date: Mar 2014

Posts: 1536
#4

27 May 2017, 14:47

Yes, that's correct, only the nearest neighbor is considered. If you want to determine each dummy separately, you need to split the neighbor dataset into "up" and "down" datasets and call geonear once for each.
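
A minimal sketch of that split, reusing the toy data from the earlier example. The file names "statalist_B_up.dta" and "statalist_B_down.dta" are made up for illustration, and the sketch assumes that geonear (in its default wide form) adds nid and km_to_nid to the data in memory and that those variables must be dropped before the second call:

Code:

```stata
* rebuild the toy B data and split it by direction
clear
set seed 123456
set obs 100
gen b_id = _n
gen double b_lat = 37 + (41 - 37) * runiform()
gen double b_lon = -109 + (109 - 102) * runiform()
gen direction = cond(mod(_n,2),"up","down")
save "statalist_B.dta", replace

* neighbor file with only the "up" locations (hypothetical file name)
keep if direction == "up"
save "statalist_B_up.dta", replace

* neighbor file with only the "down" locations (hypothetical file name)
use "statalist_B.dta", clear
keep if direction == "down"
save "statalist_B_down.dta", replace

* toy A data; in practice, load your own dataset A here
clear
set obs 100
gen a_id = _n
gen double a_lat = 37 + (41 - 37) * runiform()
gen double a_lon = -109 + (109 - 102) * runiform()

* nearest "up" neighbor: dummy_1 is 1 if it lies within 50 km
geonear a_id a_lat a_lon using "statalist_B_up.dta", n(b_id b_lat b_lon)
gen dummy_1 = km_to_nid < 50
drop nid km_to_nid

* nearest "down" neighbor: dummy_2 is 1 if it lies within 50 km
geonear a_id a_lat a_lon using "statalist_B_down.dta", n(b_id b_lat b_lon)
gen dummy_2 = km_to_nid < 50
drop nid km_to_nid
```

Because each call only sees one type of neighbor, both dummies can be 1 for the same A observation, and both are 0 when no "up" or "down" location lies within 50 km. If the nearest neighbor of a given type is farther than 50 km, then every neighbor of that type is too, so checking only the nearest one is enough.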
Marco Gallo

Join Date: May 2017

Posts: 25
#5

28 May 2017, 03:40

Okay, perfect, thank you!
