  • spmap basemap problems

    When I cut down Maurizio's Italy basemap to just 6 points it works fine with a simple master.

       +----+
       | id |
       |----|
    1. |  1 |
    2. |  2 |
    3. |  3 |
       +----+

       +-----------------------------+
       | _ID          _X          _Y |
       |-----------------------------|
    1. |   1           .           . |
    2. |   1   454919.74   4399911.4 |
    3. |   1   454936.49   4402763.1 |
    4. |   1   451735.05   4379968.2 |
    5. |   1   451842.11   4397078.2 |
    6. |   1   454919.74   4399911.4 |
       +-----------------------------+


    When I repeat the process with London long/lat coordinates and the same master I get a "master data not sorted" error:

       +----------------------------+
       | _ID         _X          _Y |
       |----------------------------|
    1. |   1          .           . |
    2. |   1   -.227731   51.530176 |
    3. |   1   -.227603   51.530166 |
    4. |   1     -.2276   51.530166 |
    5. |   1   -.227693   51.529938 |
    6. |   1   -.227697   51.529928 |
    7. |   1   -.227731   51.530176 |
       +----------------------------+


    Where am I going wrong?

    Thanks,

    Paul

  • #2
    Paul:
    welcome to the list.
    Unfortunately, as Maurizio Pisati doesn't seem to be a regular contributor to this list anymore, you would do better to contact him directly at the e-mail address reported in the output of -net describe spmap, from(http://fmwww.bc.edu/RePEc/bocode/s)-.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Dear Paul,
      without looking at the datasets you used and without knowing the full set of commands you submitted, it's hard to tell. However, it looks like your basemap dataset was not sorted by variable _ID. Anyway, you can find a detailed description of the data format required by spmap in the help file of the program.
      Best wishes,
      Maurizio


      • #4
        Thanks Maurizio,
        The data above are the full basemap and master datasets.
        I am using only one ID in the basemap. You can try it if you have time.
        Removing ids 2 and 3 from the master does not help.
        Any ideas?

        By the way you have written a great help file.

        Paul


        • #5
          Dear Paul,
          I tried to recreate both your master and (second) basemap datasets, and used them to draw the polygon defined in the basemap: Everything worked just fine for me. I created the basemap dataset by entering the values of variables _ID, _X and _Y by hand, and then sorted the dataset by _ID. I created the master dataset by simply entering the three values of variable id by hand. Sorry I can't help with your problem.
          Best wishes,
          Maurizio


          • #6
            I can replicate the problem described in the original post using the following example:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input long _ID double(_X _Y)
            1     .      .
            1  -.22  51.53
            1 -.221  51.53
            1 -.221 51.531
            1  -.22 51.531
            1  -.22  51.53
            end
            save "lonlat_coor.dta", replace
            
            clear
            input long id
            1
            end
            sort id
            
            spmap using "lonlat_coor.dta", id(id)
            It looks like Paul's coordinates file is not sorted (files created by shp2dta are always sorted by _ID). Since the order of observations matters in a coordinates file, you have to be careful how you sort it. Also, since these are latitudes and longitudes, the map created by spmap will be severely distorted because it plots unprojected coordinates. Here's how to sort the coordinates and how to use geo2xy (from SSC) to convert lat/lon to (x,y). Note that the points are .001 degrees apart:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input long _ID double(_X _Y)
            1     .      .
            1  -.22  51.53
            1 -.221  51.53
            1 -.221 51.531
            1  -.22 51.531
            1  -.22  51.53
            end
            gen obs = _n
            sort _ID obs
            drop obs
            save "lonlat_coor.dta", replace
            
            * project the lat/lon to xy using the same projection as Google Maps
            geo2xy _Y _X, replace
            save "xy_coor.dta", replace
            
            clear
            input long id
            1
            end
            sort id
            
            spmap using "lonlat_coor.dta", id(id)
            spmap using "xy_coor.dta", id(id)


            • #7
              I agree with Robert: as I mentioned in my first message, Paul's problem is likely to be generated by the basemap dataset not being sorted by _ID. The spmap help file specifies that "A basemap dataset is always required to be sorted by variable _ID." The strategy suggested by Robert to get a proper sorting works fine. Alternatively, one can use the more direct sort _ID, stable.
              Best wishes,
              Maurizio
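
              The two sorting strategies mentioned in this thread can be compared side by side. The following is only a minimal sketch: it reuses the basemap values from the example in #6, and obs is a hypothetical helper variable that exists only to make the sort keys explicit. In both cases the "Sorted by" line reported by describe should list _ID.

              ```stata
              * Basemap values copied from the example in #6
              clear
              input long _ID double(_X _Y)
              1     .      .
              1  -.22  51.53
              1 -.221  51.53
              1 -.221 51.531
              1  -.22 51.531
              1  -.22  51.53
              end

              * Approach 1: spell out every sort key with a temporary counter
              * (obs is a hypothetical helper variable, dropped after sorting)
              preserve
              gen long obs = _n
              sort _ID obs
              drop obs
              describe, short
              restore

              * Approach 2: stable sort, which keeps the existing order within each _ID
              sort _ID, stable
              describe, short
              ```

              Either approach keeps the points of each polygon in their original order, which is what a coordinates file needs.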


              • #8
                Thank you both, Maurizio and Robert. This is extraordinary. The problem is the sorting, as you both pointed out. Robert's approach is more explicit and works when I do his sorting (as does Maurizio's neater version). The thing is, sorting on _n makes no difference to the file. I exported the original and the sorted versions to text files and compared them, and they are identical! It is as if Stata knows the file has been put through sorting, a mark left on the file somewhere.

                I spent a weekend trying to get to the bottom of this and I am really delighted for the help you have given me. Now down to the real work.

                Paul
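
                The observation above (identical values, different behaviour) can be checked directly. This is a hedged sketch rather than anything taken from spmap itself: it assumes the basemap values from the example in #6, compares the data before and after sorting with cf, and reads the recorded sort key with the sortedby extended macro function. The local macro names before and after and the tempfile name unsorted are made up for the illustration.

                ```stata
                * Basemap values copied from the example in #6
                clear
                input long _ID double(_X _Y)
                1     .      .
                1  -.22  51.53
                1 -.221  51.53
                1 -.221 51.531
                1  -.22 51.531
                1  -.22  51.53
                end

                * Record the sort key before sorting and keep an unsorted copy
                local before : sortedby
                tempfile unsorted
                save "`unsorted'"

                * Sort without changing the order of observations within _ID
                sort _ID, stable
                local after : sortedby

                * cf exits with an error if any value differs from the saved copy
                cf _all using "`unsorted'"

                display "sort key before: [`before']"
                display "sort key after:  [`after']"
                ```

                If cf finds no differences while the two displayed sort keys differ, the only thing that changed is the sort order Stata records alongside the data.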


                • #9
                  With respect to sort, I strongly prefer explicitly showing all the sort keys and would prefer that the stable option be forgotten. It's a nice shorthand but it does not convey well what it is doing. I also fear that people will just start to append the option to all their sort commands to avoid the all too common insufficiently sorted "bug" (see Why does my do-file or ado-file produce different results every time I run it?).

                  With respect to why you got the error, spmap uses version 9 syntax and therefore all merges are specified using the old syntax (the new merge syntax was adopted in version 11). To maintain backwards compatibility, even Stata 14 will recognize the old syntax and implement the command the way it would have in versions prior to Stata 11. With the old syntax, both master and using datasets had to be sorted beforehand. Stata keeps track of the sort order (see the sortedby macro extended function, help extended_fcn) of a dataset, including the one in memory. You can also view the current sort order at the end of the output of the describe command.

                  Here's a quick example of the difference in behavior between the new and the old syntax of merge:

                  Code:
                  clear
                  input id x
                  1 3
                  1 2
                  1 1
                  2 4
                  2 1
                  2 7
                  end
                  tempfile using
                  save "`using'"
                  
                  clear
                  input id
                  1
                  end
                  tempfile master
                  save "`master'"
                  
                  * the new syntax will automatically sort
                  merge 1:m id using "`using'"
                  
                  * the old syntax requires prior sorting of both datasets
                  use "`master'", clear
                  describe
                  merge id using "`using'"
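
                  For completeness, here is a sketch of the same old-syntax merge once both datasets carry a sort order. It is an illustration under assumptions, not a description of what spmap does internally: the tempfile name coords and the helper variable obs are hypothetical, and the data are the same toy values as above.

                  ```stata
                  clear
                  input id x
                  1 3
                  1 2
                  1 1
                  2 4
                  2 1
                  2 7
                  end

                  * Sort the using data with every key shown explicitly
                  * (obs is a hypothetical helper variable)
                  gen long obs = _n
                  sort id obs
                  drop obs
                  tempfile coords
                  save "`coords'"

                  clear
                  input id
                  1
                  end

                  * Sort the master and confirm the recorded sort key
                  sort id
                  local keys : sortedby
                  display "master sorted by: `keys'"

                  * Old-syntax merge, which expects both datasets to be sorted by id
                  merge id using "`coords'"
                  tabulate _merge
                  ```

                  The display line shows the sort key that the old syntax checks for before it will merge.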
