spmap basemap problems

Paul O'Brien

Join Date: Sep 2014

Posts: 8
#1


21 Sep 2015, 02:53

When I cut down Maurizio's Italy basemap to just 6 points it works fine with a simple master.

     +----+
     | id |
     |----|
  1. |  1 |
  2. |  2 |
  3. |  3 |
     +----+

     +-----------------------------+
     | _ID          _X          _Y |
     |-----------------------------|
  1. |   1           .           . |
  2. |   1   454919.74   4399911.4 |
  3. |   1   454936.49   4402763.1 |
  4. |   1   451735.05   4379968.2 |
  5. |   1   451842.11   4397078.2 |
  6. |   1   454919.74   4399911.4 |
     +-----------------------------+

When I repeat the process with London long/lat coordinates and the same master I get a "master data not sorted" error:

     +----------------------------+
     | _ID         _X          _Y |
     |----------------------------|
  1. |   1          .           . |
  2. |   1   -.227731   51.530176 |
  3. |   1   -.227603   51.530166 |
  4. |   1     -.2276   51.530166 |
  5. |   1   -.227693   51.529938 |
  6. |   1   -.227697   51.529928 |
  7. |   1   -.227731   51.530176 |
     +----------------------------+

Where am I going wrong?

Thanks,

Paul
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17714
#2

21 Sep 2015, 05:00

Paul:
welcome to the list.
Unfortunately, as Maurizio Pisati no longer seems to be a regular contributor to this list, you would do better to contact him directly at the e-mail address reported in -net describe spmap, from(http://fmwww.bc.edu/RePEc/bocode/s)-

Kind regards,
Carlo
(Stata 19.0)
Maurizio Pisati

Join Date: Mar 2014

Posts: 25
#3

21 Sep 2015, 07:52

Dear Paul,
without looking at the datasets you used and without knowing the full set of commands you submitted, it's hard to tell. However, it looks like your basemap dataset was not sorted by variable _ID. Anyway, you can find a detailed description of the data format required by spmap in the help file of the program.
Best wishes,
Maurizio
Paul O'Brien

Join Date: Sep 2014

Posts: 8
#4

21 Sep 2015, 08:06

Thanks Maurizio,
The data above are the full basemap and master datasets.
I am using only one ID in the basemap. You can try it if you have time.
Removing ids 2 and 3 from the master does not help.
Any ideas?

By the way you have written a great help file.

Paul
Maurizio Pisati

Join Date: Mar 2014

Posts: 25
#5

21 Sep 2015, 10:38

Dear Paul,
I tried to recreate both your master and (second) basemap datasets, and used them to draw the polygon defined in the basemap: Everything worked just fine for me. I created the basemap dataset by entering the values of variables _ID, _X and _Y by hand, and then sorted the dataset by _ID. I created the master dataset by simply entering the three values of variable id by hand. Sorry I can't help with your problem.
Best wishes,
Maurizio

Robert Picard

Join Date: Mar 2014
Posts: 1536
#6

21 Sep 2015, 11:44

I can replicate the problem described in the original post using the following example:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long _ID double(_X _Y)
1     .      .
1  -.22  51.53
1 -.221  51.53
1 -.221 51.531
1  -.22 51.531
1  -.22  51.53
end
save "lonlat_coor.dta", replace

clear
input long id
1
end
sort id

spmap using "lonlat_coor.dta", id(id)

It looks like Paul's coordinates file is not sorted (those created by shp2dta are always sorted by _ID). Since the order of observations matters in a coordinates file, you have to be careful about how you sort it. Also, since these are latitude and longitude, the map created by spmap will be severely distorted because it plots unprojected coordinates. Here's how to sort the coordinates and how to use geo2xy (from SSC) to convert lat/lon to (x,y). Note that the points are .001 degrees apart:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input long _ID double(_X _Y)
1     .      .
1  -.22  51.53
1 -.221  51.53
1 -.221 51.531
1  -.22 51.531
1  -.22  51.53
end
gen obs = _n
sort _ID obs
drop obs
save "lonlat_coor.dta", replace

* project the lat/lon to xy using the same projection as Google Maps
geo2xy _Y _X, replace
save "xy_coor.dta", replace

clear
input long id
1
end
sort id

spmap using "lonlat_coor.dta", id(id)
spmap using "xy_coor.dta", id(id)


Maurizio Pisati

Join Date: Mar 2014

Posts: 25
#7

21 Sep 2015, 12:28

I agree with Robert: as I mentioned in my first message, Paul's problem is likely to be generated by the basemap dataset not being sorted by _ID. The spmap help file specifies that "A basemap dataset is always required to be sorted by variable _ID." The strategy suggested by Robert to get a proper sorting works fine. Alternatively, one can use the more direct sort _ID, stable.
Best wishes,
Maurizio
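
[Editor's sketch, not part of the original posts: a minimal example of the sort _ID, stable alternative mentioned above, applied to the test data from post #6. It assumes spmap is installed; the file name is reused from that post.]

Code:

clear
input long _ID double(_X _Y)
1     .      .
1  -.22  51.53
1 -.221  51.53
1 -.221 51.531
1  -.22 51.531
1  -.22  51.53
end

* stable keeps the existing order of observations within each _ID,
* so the polygon's vertices stay in sequence
sort _ID, stable
save "lonlat_coor.dta", replace

clear
input long id
1
end
sort id

spmap using "lonlat_coor.dta", id(id)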
Paul O'Brien

Join Date: Sep 2014

Posts: 8
#8

22 Sep 2015, 04:26

Thank you both, Maurizio and Robert. This is extraordinary. The problem is the sorting, as you both pointed out. Robert's approach is more explicit and works when I do his sorting (as does Maurizio's neater version). The thing is, sorting on _n makes no difference to the file. I exported the original and the sorted versions to text files, compared them, and they are identical! It is as if Stata knows the file has been put through sorting, with a mark left on the file somewhere.

I spent a weekend trying to get to the bottom of this and I am really delighted for the help you have given me. Now down to the real work.

Paul
Robert Picard

Join Date: Mar 2014

Posts: 1536
#9

22 Sep 2015, 09:36

With respect to sort, I strongly prefer explicitly showing all the sort keys and would prefer that the stable option be forgotten. It's a nice shorthand but it does not convey well what it is doing. I also fear that people will just start to append the option to all their sort commands to avoid the all too common insufficiently sorted "bug" (see Why does my do-file or ado-file produce different results every time I run it?).

With respect to why you got the error, spmap uses version 9 syntax and therefore all merges are specified using the old syntax (the new merge syntax was adopted in version 11). To maintain backwards compatibility, even Stata 14 will recognize the old syntax and implement the command the way it would have in versions prior to Stata 11. With the old syntax, both master and using datasets had to be sorted beforehand. Stata keeps track of the sort order (see the sortedby macro extended function, help extended_fcn) of a dataset, including the one in memory. You can also view the current sort order at the end of the output of the describe command.

Here's a quick example of the difference in behavior between the new and the old syntax of merge:

Code:

clear
input id x
1 3
1 2
1 1
2 4
2 1
2 7
end
tempfile using
save "`using'"

clear
input id
1
end
tempfile master
save "`master'"

* the new syntax will automatically sort
merge 1:m id using "`using'"

* the old syntax requires prior sorting of both datasets
use "`master'", clear
describe
merge id using "`using'"
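
[Editor's sketch, not part of the original post: a small illustration of the recorded sort order described above. The data are made up and already in id order, so sorting leaves the observations unchanged and only the sort flag changes. The local macro name keys is arbitrary.]

Code:

clear
input id x
1 3
1 2
2 4
end

* data entered by hand carry no recorded sort order
local keys : sortedby
display "sorted by: `keys'"

* the observations do not move, but Stata now records id as the sort key
sort id, stable
local keys : sortedby
display "sorted by: `keys'"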
