  • shp2dta doesn't recognize dbf file

    Dear all,
    I have a question concerning the command shp2dta in Stata 13.1:

    To later map some of my data with spmap, I would like to use shp2dta
    to convert my shp files to dta files. Since I want to create a map of the European Union divided into small regions as defined by the statistical NUTS 2016 specification, I want to use the publicly available shp files provided by Eurostat.

    I didn't find a way to use e.g. dataex to post a convenient data example, so I'm left with simply stating which data I want to use. The data can be found here: https://ec.europa.eu/eurostat/de/web...cal-units/nuts , NUTS 2016, 1:1 Million, File format: SHP -> open e.g. ZIP file NUTS_BN_01M_2016_3857.shp.zip -> download NUTS_BN_01M_2016_3857.shp and NUTS_BN_01M_2016_3857.dbf .

    My commands so far are:

    Code:
    cd "C:\SHP-files"
    shp2dta using NUTS_BN_01M_2016_3857, database(eudb) coordinates(eucoord) genid(id)
    (database(eudb) specifies that the database file is named eudb.dta, coordinates(eucoord) specifies that the coordinate file is named eucoord.dta and genid(id) specifies that the ID variable created in eudb.dta is named "id".)

    Executing these commands provides me with only one new file (“eucoord.dta”) and Stata issues the message:

    type: 3
    NUTS_BN_01M_2016_3857.dbf: invalid dBASE data type
    r(610);

    ... stating that my designated file is not a Stata-format file. What am I doing wrong?

    On a side note, I have tried the following: opening the dbf file in Excel, copy-pasting the shown data into the Stata Data Editor, checking “Treat first row as variable names”, and then manually saving this as eudb.dta in “C:\SHP_files”. This way, I am able to create maps, e.g. (a rather useless one) indicating the number of the variable nuts_bn_id:

    Code:
    gen number = nuts_bn_id
    spmap number using eucoord, id(nuts_bn_id)
    Does somebody have an idea why Stata might not recognize the dbf file as a valid data type when using shp2dta?

    Thank you in advance and Best wishes,
    Françoise





  • #2
    Originally posted by Francoise Diebold
    Does somebody have an idea why Stata might not recognize the dbf file as a valid data type when using shp2dta?
    First, shp2dta, although written by Kevin Crow (StataCorp), is not an official part of Stata, so the statement above is not quite accurate: technically it is a user-written package for Stata whose author happens to be a Stata employee. To confirm, if you type which shp2dta in vanilla Stata you get
    Code:
    . which shp2dta
    command shp2dta not found as either built-in or ado-file
    r(111);
    Next, once we realize that shp2dta is a user-written add-on module for Stata, we also have to acknowledge that it is already at least 13 years old. This is important, because what you think of as the DBF file format is in fact a large collection of format variations with different features, limitations and compatibility issues. Matters were complicated further a few years ago by the need to accommodate Unicode, at a time when the producers of dBASE had already lost (imho) their leading position in the database market.

    The following table is just the tip of the iceberg:
    Code:
    • 0x02 FoxBASE
    • 0x03 FoxBASE+/Dbase III plus, no memo
    • 0x30 Visual FoxPro
    • 0x31 Visual FoxPro, autoincrement enabled
    • 0x32 Visual FoxPro with field type Varchar or Varbinary
    • 0x43 dBASE IV SQL table files, no memo
    • 0x63 dBASE IV SQL system files, no memo
    • 0x83 FoxBASE+/dBASE III PLUS, with memo
    • 0x8B dBASE IV with memo
    • 0xCB dBASE IV SQL table files, with memo
    • 0xF5 FoxPro 2.x (or earlier) with memo
    • 0xE5 HiPer-Six format with SMT memo file
    • 0xFB FoxBASE
    Most notably, the DBF format allowed for long strings (the 'with memo' flavors in the table above) long before long strings appeared in Stata. This is a likely reason why Kevin protected his code from non-standard or unknown variations by restricting the file type to 3. Only three. No other type:

    Code:
        // Get .dbase version
        ver = fbufget(C, fh_in, "%1b")
        bufbyteorder(C, 2)
    
        // Get year of file
        year = fbufget(C, fh_in, "%1bu") + 1900
    
        if (ver!=3| year<1900 | year>2050) {
            errprintf("%s: invalid dbase (.dbf) file\n", dbf_file)
            exit(610)
        }
    And this is exactly the message you are getting. So, either

    1. Contact the producer of the map files you are using and indicate that you need compatibility with this [sub]format, or
    2. Use external tools to change the file type yourself before using shp2dta. You will find thousands of them if you google it, or
    3. Use a GIS package that lets you open the file and resave it, and hope that it will either ask you for the version or choose type 3 automatically.
    4. Since Stata 15.0 (?check?), Stata supports import/export of DBF files natively, but shp2dta does not use this feature, since it was apparently last updated a few years before Stata 15. In my geotools package I also preferred to compile a custom writer.
    5. Try the ODBC drivers to read in and resave the DBF file.
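
    To check whether the version restriction described above applies to a particular file before trying any of these options, one can look at the file header directly. The following is only a minimal sketch in Python (an external tool, not part of shp2dta or Stata), assuming the common dBASE III header layout: version in byte 0, year minus 1900 in byte 1, and 32-byte field descriptors starting at byte 32 and terminated by 0x0D. The function name read_dbf_header is made up for illustration.

```python
import struct


def read_dbf_header(path):
    """Read basic header information from a .dbf file.

    Assumes the dBASE III style layout: a 32-byte file header followed by
    32-byte field descriptors, terminated by a 0x0D byte.
    """
    with open(path, "rb") as fh:
        header = fh.read(32)
        if len(header) < 32:
            raise ValueError("file is too short to be a DBF file")

        version = header[0]       # the byte compared against 3 in the check above
        year = header[1] + 1900   # same year convention as the check above
        # Record count (uint32), header length and record length (uint16),
        # all little-endian, stored in bytes 4..11.
        n_records, header_len, record_len = struct.unpack("<IHH", header[4:12])

        fields = []
        while True:
            desc = fh.read(32)
            # Stop at end of file or at the descriptor terminator.
            if len(desc) < 32 or desc[0] == 0x0D:
                break
            name = desc[:11].split(b"\x00", 1)[0].decode("ascii", "replace")
            ftype = chr(desc[11])  # e.g. C (character), N (numeric), D (date)
            fields.append((name, ftype))

    return {
        "version": version,
        "year": year,
        "records": n_records,
        "header_len": header_len,
        "record_len": record_len,
        "fields": fields,
    }


# Example call (the file name is the one from post #1):
# info = read_dbf_header("NUTS_BN_01M_2016_3857.dbf")
# print(info["version"], info["year"], info["fields"])
```

    If the version is not 3, or the year falls outside 1900-2050, the check quoted above would reject the file. The list of field types may also help when the complaint is about a data type rather than about the file as a whole.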

    So while this doesn't solve your problem, I hope it clarifies what's going on. That said, shp2dta by Kevin Crow is a wonderful program that has been serving its purpose for almost as long as I have been a Stata user.

    Best, Sergiy Radyakin



    • #3
      Fantastic, thank you very much, Sergiy!
      This was some really helpful background information (that honestly took me a while to fully understand) ... I went with your tip number 3, downloaded QGIS and resaved the entire shapefile. With these files, shp2dta runs smoothly without any problems!

      Best,
      Françoise



      • #4
        Hi, I have the same problem with the same data. May I ask which procedure you followed to open and resave the .dbf file using QGIS?
