
  • Combining shapefiles in Stata

    Dear Statalisters

    I'm trying to draw maps of England and Wales at the regional level, i.e. the nine macro-regions of England plus Wales. I have two shapefiles: one for the regions of England, and one for the countries of Great Britain (England, Scotland and Wales), from which I want to extract the boundaries of Wales and add them to the English regions. (Full disclosure: there was a problem with the country shapefile, because it included a variable called "long" that shp2dta couldn't deal with, so I first modified the shapefile in R to remove this variable.) There is an earlier thread that discusses a similar problem, but unfortunately those steps are not working for me.

    I'm following the steps outlined by Roberto Liebscher in the thread mentioned above, simply appending the Wales boundaries to the English boundaries. However, whenever I try to draw a map (using Maurizio Pisati's spmap), Stata "freezes" (no error message, just a popup from Windows saying that the programme has stopped responding). Drawing the two maps individually is not a problem. It is also possible to add Wales as a polygon to the base map of the English regions, but ultimately I would like to draw choropleth maps, so that is not a satisfactory option.

    Here's the code I'm using (in Stata 15.0):

    Code:
     shp2dta using Regions_December_2015_Full_Clipped_Boundaries_in_England, data(region_wow_shp) coor(region_wow_coor) genid(id) genc(c) replace
     shp2dta using countries, data(countryshp) coor(countrycoor) genid(id) genc(c) replace
    
     use countryshp, clear
     keep if ctry16nm == "Wales"
     rename ctry16nm rgn15nm
     rename ctry16cd rgn15cd
     drop bng_e bng_n
     replace id = 10 // IDs in region shapefile run from 1 to 9
    
     append using region_wow_shp
     sort id
     save regionshp, replace
    
     use countrycoor, clear
     keep if _ID == 3 // Extracting the Welsh coordinates
     replace _ID = 10 // Assign the same ID as above
    
     append using region_wow_coor
     sort _ID
     save regioncoor, replace
    
    use regionshp, clear
    spmap using regioncoor, id(id) // This is where the programme stops responding
    Many thanks for your time! Any thoughts appreciated!

  • #2
    Do the shapefiles each use the same projections? Perhaps the issue is related to Stata trying to resolve two different coordinate projections in a way that can be rendered easily.


    • #3
      wbuchanan I don't think that's the problem. When I map the regions with Wales as a polygon, it falls neatly into place. They're also coming from the same source (geoportal.statistics.gov.uk). But is there a way to formally check this? Thanks!


      • #4
        I think that your Stata froze because the shapefiles you downloaded are too large, far more detailed than what's needed to create a map in Stata. I went to the web site you pointed at and downloaded the shapefiles you used (version 2017). The coordinates file for the countries contains 2.3 million points!

        What you need are simpler maps, which do not try to describe details that cannot be resolved on a Stata map. I downloaded these:
        Code:
        http://geoportal.statistics.gov.uk/datasets/countries-december-2017-generalised-clipped-boundaries-in-great-britain
        http://geoportal.statistics.gov.uk/datasets/regions-december-2017-generalised-clipped-boundaries-in-england
        Both suffer from the same "long" problem you described. That's because the database part of the shapefile contains a variable for the longitude that's called "long", but you cannot create a variable called long since it is the name of a Stata data type. You have to be careful when editing binary files that the replacement string has exactly the same length as the one you start with. The following code replaces all occurrences of "long" with "LONG". This works in this case because it triggers only one change per file and, since Stata is case sensitive, it can create a variable called LONG. The code also creates a copy of the coordinates file so that both components of the shapefile used by shp2dta have the same name (with ".shp" and ".dbf" file extensions):
        Code:
        clear all
        
        * the common filestub for all components of the shapefile
        local filestub "Regions_December_2017_Generalised_Clipped_Boundaries_in_England"
        * fix the database part of the shapefile
        filefilter "`filestub'.dbf" "regions.dbf", from("long") to("LONG") replace
        dis r(occurrences)
        * use the same filestub for the .shp part of the shapefile
        copy "`filestub'.shp" "regions.shp", replace
        * show the projection details
        type "`filestub'.prj"
        * convert the database and coordinates part to Stata datasets
        shp2dta using regions, data(region_wow_shp) coor(region_wow_coor) genid(id) genc(c) replace
        
        * the common filestub for all components of the shapefile
        local filestub "Countries_December_2017_Generalised_Clipped_Boundaries_in_Great_Britain"
        * fix the database part of the shapefile
        filefilter "`filestub'.dbf" "countries.dbf", from("long") to("LONG") replace
        dis r(occurrences)
        * use the same filestub for the .shp part of the shapefile
        copy "`filestub'.shp" "countries.shp", replace
        * show the projection details
        type "`filestub'.prj"
        * convert the database and coordinates part to Stata datasets
        shp2dta using countries, data(countryshp) coor(countrycoor) genid(id) genc(c) replace
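
        The "same length, one change per file" condition can also be checked in code rather than by eye. This is only a minimal sketch on a made-up text file (demo_in.txt and demo_out.txt are hypothetical names, not part of the shapefiles), and it relies on the r(occurrences), r(bytes_from) and r(bytes_to) results that filefilter stores:

        ```stata
        * write a tiny stand-in file (hypothetical; not a real .dbf)
        tempname fh
        file open `fh' using "demo_in.txt", write text replace
        file write `fh' "id,long,lat" _n
        file close `fh'

        * same replacement as above, applied to the stand-in file
        filefilter "demo_in.txt" "demo_out.txt", from("long") to("LONG") replace

        * copy the stored results before running any other command
        local n_hits = r(occurrences)
        local b_from = r(bytes_from)
        local b_to   = r(bytes_to)

        * the replacement must not change the file length
        assert `b_from' == `b_to'
        * and exactly one occurrence should have been replaced
        assert `n_hits' == 1
        ```

        With the real files, the same two assert lines could follow each filefilter call, so that the do-file stops if a different release of the boundaries contains "long" more than once.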
        The following code extracts "Wales" from the countries shapefile and saves both components separately. Note that the coordinates are ordered but there is no variable in the dataset that can be used to restore the order if the data is sorted. I add the obs variable to address this. Finally, I create a map using spmap. It's still slow for my taste but workable.
        Code:
        * create shapefile datasets for Wales
        use "countryshp.dta", clear
        keep if id == 3
        list
        rename ctry17nm rgn17nm
        rename ctry17cd rgn17cd
        drop bng_e bng_n
        replace id = 10 // IDs in region shapefile run from 1 to 9
        save "Wales_data.dta", replace
        use "countrycoor.dta", clear
        gen long obs = _n
        keep if _ID == 3
        replace _ID = 10
        save "Wales_coor.dta", replace
        
        * combine with datasets for regions 1-9
        use "region_wow_shp.dta", clear
        assert inrange(id,1,9)
        append using "Wales_data.dta"
        isid id, sort
        sum id
        save "regionW_data.dta", replace
        use "region_wow_coor.dta", clear
        gen long obs = _n
        append using "Wales_coor.dta"
        isid _ID obs, sort
        save "regionW_coor.dta", replace
        
        * create a map
        use "regionW_data.dta", clear
        spmap using "regionW_coor.dta", id(id)
        and the resulting map:

        [Image: my_map.png]


        • #5
          Dear Robert
          That worked, thank you so much! I suspect the problem was around the ID variables and the sorting. But the code runs much quicker now with the generalised boundaries.
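
          A minimal sketch of why the sorting can matter, using made-up coordinates for two small polygons. Unless the stable option is used, Stata's sort does not promise to keep tied observations in their original order, so sort _ID on its own may reorder the points within a polygon. A row counter created before any sorting lets the drawing order be restored. The shuffle variable is only a stand-in for whatever scrambles the rows:

          ```stata
          clear
          input _ID _X _Y
          1 . .
          1 0 0
          1 0 1
          1 1 1
          1 0 0
          2 . .
          2 2 0
          2 2 1
          2 3 1
          2 2 0
          end

          * record the original point order before anything is sorted
          gen long obs = _n

          * stand-in for any operation that scrambles the rows
          set seed 2018
          gen double shuffle = runiform()
          sort shuffle

          * restore the within-polygon order and verify it
          sort _ID obs
          assert obs == _n
          isid _ID obs
          drop shuffle
          ```

          This works here because Wales is given the highest _ID, so sorting on _ID and obs puts its points after those of regions 1 to 9 while keeping each polygon's points in sequence.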


          • #6
            Hi everyone,

            I am trying to create a map in Stata to show which major roads (highways and secondary roads) pass through each district, and where the major cities are located within districts or provinces. I have the shapefiles for the administrative provincial/district boundaries, the road network, and the major cities. I am trying to combine/merge these shapefiles, but there is no common ID (identifier). Can anyone help, please? I tried to use the information in this thread but am still not able to do it. I can't attach my files, but I hope I have made my point clear.
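
            One possible approach, sketched here on made-up data, is to overlay the layers rather than merge them: spmap has line() and point() options that draw additional coordinate datasets on top of the base map, so no common identifier is needed. All file names, variable names and coordinates below are hypothetical, the layers are assumed to share one projection, and spmap must be installed (ssc install spmap):

            ```stata
            * two square "districts" as the base map coordinates
            clear
            input _ID _X _Y
            1 . .
            1 0 0
            1 0 1
            1 1 1
            1 1 0
            1 0 0
            2 . .
            2 1 0
            2 1 1
            2 2 1
            2 2 0
            2 1 0
            end
            save "demo_district_coor.dta", replace

            * one "road" crossing both districts
            clear
            input _ID _X _Y
            1 . .
            1 0.1 0.2
            1 1.0 0.5
            1 1.9 0.8
            end
            save "demo_road_coor.dta", replace

            * two "cities" as points
            clear
            input city_id x y
            1 0.5 0.5
            2 1.5 0.4
            end
            save "demo_city_pts.dta", replace

            * master data: one row per district
            clear
            input id
            1
            2
            end
            spmap using "demo_district_coor.dta", id(id) ///
                line(data(demo_road_coor.dta) color(red)) ///
                point(data(demo_city_pts.dta) xcoord(x) ycoord(y))
            ```

            With real shapefiles, each layer would first be converted separately with shp2dta, and the resulting coordinates datasets passed to line() and point() in the same way.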


            • #7
              I saw the same problem with the 'long' variable name: it was not imported under that name, but the variable is there, just renamed to 'var1'.
