  • merging unit level data with shape file

    Dear all, I am working with firm-level data in which the only identifiable spatial unit is the sub-region. There is no information, such as distance from the center of the sub-region, that would pin down a firm's exact location within its sub-region. Because I am interested in spatial analysis, the data have to be merged with a shapefile. My problem is that I am not sure whether this is possible and, if it is, how to go about it. I tried this in R and it returned an error message indicating that the data contain multiple records per sub-region.

    Thanks a lot

  • #2
    Hi Prakash,

    You may want to clarify a few things to receive some more help.
    1) Does the firm level data have a variable for a unique identifier?
    2) Is that unique identifier the same in the shape file?

    From the error you received in R, it sounds like you may have more than one firm in each sub-region. Would all the firms in the same sub-region receive the same information from the shapefile? I think your answer to this question will be yes; I can't imagine the same region having more than one shape.

    If your answer is yes to all the questions I asked, I think your solution will be to use the following code. Note: I am assuming that your master dataset is the firm dataset. I am calling the unique identifier "id" and the shape data you are merging "shape", and supposing the file is saved on your desktop; you would need to adjust all of these to match what you have.
    Code:
    merge m:1 id using "/Desktop/shape.dta"
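
    Both questions above can be checked before merging. This is only a sketch under the same assumptions as the merge line ("id" as the identifier, the shape data at "/Desktop/shape.dta"), plus a hypothetical file name "firms.dta" for the firm data:

    ```stata
    * Shape data: id must identify exactly one observation per sub-region
    use "/Desktop/shape.dta", clear
    isid id                      // exits with an error if id is not unique

    * Firm data: repeated ids are expected (several firms per sub-region)
    use "firms.dta", clear       // hypothetical file name
    duplicates report id         // counts observations by number of copies of each id
    ```

    If isid passes on the shape data, the m:1 merge should be valid, since m:1 only requires the key to be unique in the using dataset.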


    • #3
      Patrick, the answer to your question is that the spatial unit is the sub-region, not the firm, so the shapefile contains no unique firm-level identifier. My interest is in studying the spatial effect of firms in one sub-region on the behaviour of firms in another sub-region, and vice versa.

      Please ask if I am not making myself clear.


      • #4
        A shapefile contains information on the geometry of geographical features (i.e. coordinates) and a database of attributes for each shape (e.g. population, area, income, etc.).

        Usually, a shapefile is converted to Stata format using shp2dta (from SSC). The coordinates dataset uses _ID to identify the shape and _X _Y for the coordinates (these can be longitude and latitude or projected (x,y) coordinates). The database dataset contains one observation per feature.
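
        A minimal sketch of that conversion, assuming a hypothetical shapefile named "subregions.shp"; the output names shape_database and shape_coordinates and the id variable name shape_id are placeholders too:

        ```stata
        * One-time install from SSC
        ssc install shp2dta

        * Writes two Stata datasets: attributes (one obs per shape) and coordinates.
        * The /// line continuation requires running this from a do-file.
        shp2dta using "subregions.shp", database(shape_database) ///
            coordinates(shape_coordinates) genid(shape_id) replace

        * Inspect the attribute table to find the sub-region identifier
        use "shape_database", clear
        describe
        ```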

        If you want to merge firm level data using a sub-region identifier to the database part of the shapefile and both datasets include the identifier, then something like

        Code:
        merge m:1 subregion using "shape_database", keep(master match) nogen
        should be all that is needed.

        If you are trying to determine in which sub-region each firm is located by using the firm's location (lat/lon) and a shapefile of sub-regions, then you can use geoinpoly (from SSC) if both the firm and the shapefile coordinates use the same reference system.


        • #5
          Thanks Robert, I take your point and understand the limitation of my data. In the firm-level data there is no way to identify the exact location (lat/lon) of a firm. All I know is which sub-region each firm belongs to. Having said that, I would like to ask you all whether it is possible to merge the data with the shapefile and do a spatial analysis in which I can also include firm-level control variables.

          Thanks a lot.


          • #6
            So you don't have the location of firms and this is not a shapefile of the sub-regions (otherwise the merge approach would work).

            To match each firm to a shapefile, your next option is to use the location of the sub-region as a proxy for the location of the firm. If you have a shapefile of the sub-regions, you can use each sub-region's centroid. If you don't have that, you need to find the location of some point in the sub-region, perhaps an administrative unit, and use that as a proxy for the location of the firm.
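
            If a shapefile of the sub-regions is available, the centroid approach might look like the sketch below. The file names "subregions.shp" and "firms.dta", and the identifier "subregion" being present in both datasets, are assumptions; with gencentroids(c), shp2dta should store the centroids in the database dataset as x_c and y_c.

            ```stata
            * Convert the shapefile and compute one centroid per sub-region.
            * The /// line continuation requires running this from a do-file.
            shp2dta using "subregions.shp", database(shape_database) ///
                coordinates(shape_coordinates) genid(shape_id) gencentroids(c) replace

            * Attach the sub-region centroid to every firm in that sub-region
            use "firms.dta", clear       // hypothetical file name
            merge m:1 subregion using "shape_database", ///
                keepusing(x_c y_c) keep(master match) nogen
            ```

            All firms in the same sub-region end up with identical proxy coordinates, so the distance between any two firms within a sub-region is zero.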

            As to the implication of using a proxy location in terms of spatial analysis, that's way above my pay grade.


            • #7
              Yes Robert, the only identifiable spatial unit in both the shapefile and the data is the sub-region. The firms are located within the sub-regions, but there is no information about their lat/lon. I am stuck at this point and not sure whether what you have suggested would work. I will have to work on your suggestion, but it is not yet clear to me how to do this.
              Anyway, thanks for your insight; it gives me hope at least.

              Thanks a lot
