Hello,
I am trying to link time series data (each observation has long/lat attributes) to a second dataset that contains district centroids. I want to determine which grid points lie within 100 km of each district center. I am using geodist for this, and it works fine if I limit the join to a single date (the two datasets share no variables, so I have to use "cross"). However, that leaves me with data only for 1/1/09, and I also need the other 364 days. Is there an easier way to do this than my code below?
Any help would be greatly appreciated, and please let me know if I left out any relevant information!
Code:
local satafiles: dir . files "*.csv"
foreach file of local satafiles {
    import delimited using `file', clear
    gen newdate = mdy(month, day, year)
    egen newid = group(newdate)
    bysort newid: gen id=_n
    save `file'.dta, replace
    keep if newid==1
    cross using griddata.dta
    geodist latitude longitude centroid_latitude centroid_longitude, gen(dist)
    merge m:m id using `file'.dta
    save `file'.dta, replace
}
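One possible restructuring, offered only as a sketch: the distance between a grid point and a centroid does not depend on the date, so it could be computed once on the distinct grid points and then joined back to every day's observations with joinby, which avoids both the single-date restriction and merge m:m. This assumes that each CSV holds the same latitude/longitude values on every date, that griddata.dta contains centroid_latitude and centroid_longitude, and that geodist (from SSC) reports kilometres by default. The tempfile name and the output file suffix are made up for illustration.

```stata
* Sketch under the assumptions stated above; not tested on the real data.
* Requires geodist: ssc install geodist
local csvfiles : dir . files "*.csv"
foreach file of local csvfiles {
    import delimited using "`file'", clear
    gen newdate = mdy(month, day, year)
    format newdate %td

    * Keep the full daily data aside (hypothetical tempfile name).
    tempfile daily
    save `daily'

    * Distances are date-independent, so work on distinct grid points only.
    keep latitude longitude
    duplicates drop

    * Pair every grid point with every district centroid.
    cross using griddata.dta
    geodist latitude longitude centroid_latitude centroid_longitude, gen(dist)
    keep if dist <= 100

    * Attach all dates for each grid point that is near a centroid.
    joinby latitude longitude using `daily'

    * Hypothetical output name; adjust as needed.
    save "`file'_within100km.dta", replace
}
```

The result has one row per grid point, centroid, and date for pairs within 100 km. Because joinby matches on the exact stored latitude and longitude values, both sides need to come from the same import, as they do here.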