
  • Merging two datasets by lagged observations

    Hi,
    I have two large datasets, one on foreign-aided projects and one from firm-level surveys. A snapshot is below. I want to merge the firm-level data with the aid dataset using a 2-year lag, so that for each firm we have the number and the amount of projects implemented in the firm's province two years before the survey year. Shape area and shape length are geographical coordinates of the respective provinces/states in both datasets.
    Kindly advise on the appropriate code for this merge.
    Kind regards,
    [Attached image: Screenshot (808).png]

  • #2
    Something like this--modify as needed to work with your actual Stata data sets:
    Code:
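    * Summarize the aid data to one row per country-province-year
    use aid_data, clear
    collapse (count) projectid (sum) amount (first) shape_length shape_area, ///
        by(country province year)
    tempfile aid_summary
    save `aid_summary'

    * Shift each firm's survey year back two years, then match on the lagged year
    use firm_data, clear
    gen year = survey_year - 2
    merge m:1 country province year using `aid_summary'
    drop year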
    This code could not be made exact, nor tested, because a screenshot of a spreadsheet is simply not a helpful way to show example data for purposes of developing and verifying code. In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
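
    For readers who have not used -dataex- before, a minimal illustration might look like the following. It uses Stata's bundled auto dataset purely for demonstration (that dataset and the variables chosen are not from this thread); substitute your own variables.
    Code:
    * Load a small built-in dataset (illustration only)
    sysuse auto, clear
    * Post the first 10 observations of three variables
    dataex make price mpg in 1/10
    * Or post every variable, limited to 20 observations
    dataex, count(20)

    The output, including the "copy starting from the next line" markers, can be pasted directly into a forum post.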



    • #3
      Thanks for the guidance, and apologies for sharing the Excel screenshot. Here is what my data look like with dataex:

      Firm Data:


      copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double(idstd year) str32 country str43 province_admn1 double(longitude latitude shape_leng    shape_area)
      454389 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454464 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454374 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454406 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454459 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454476 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454478 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454425 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454474 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454529 2005 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      454422 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454489 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454484 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454463 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454618 2005 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      454357 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454486 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454550 2005 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      454490 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454471 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454387 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454360 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454457 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454597 2005 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      454483 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454378 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454663 2005 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      454384 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454614 2005 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      454458 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      454610 2005 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      454661 2005 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      454667 2005 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      454438 2005 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      454456 2005 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451204 2008 "Afghanistan" ""                         .                .       .      .
      451088 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      450902 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451266 2008 "Afghanistan" ""                         .                .       .      .
      451281 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      451502 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451346 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      450976 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451662 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      450796 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451397 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      450790 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451661 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      451082 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      450887 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450800 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451471 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451666 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      451532 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451564 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      450936 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450764 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451372 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      450794 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451026 2008 "Afghanistan" ""                         .                .       .      .
      451145 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      451408 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      450942 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451434 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      450930 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450705 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450807 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451259 2008 "Afghanistan" ""                         .                .       .      .
      451484 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451654 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      451403 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      450997 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450918 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451409 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      451122 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      450789 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450927 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450770 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450990 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451019 2008 "Afghanistan" ""                         .                .       .      .
      451516 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451745 2008 "Afghanistan" ""                         .                .       .      .
      450793 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451048 2008 "Afghanistan" ""                         .                .       .      .
      450940 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451565 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      450798 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450910 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451469 2008 "Afghanistan" "Hirat"              62.2164         34.35083 14.0084 5.3537
      451610 2008 "Afghanistan" "Kandahar"          65.71576         31.62057 12.3273 5.0527
      451727 2008 "Afghanistan" ""                         .                .       .      .
      450908 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      450757 2008 "Afghanistan" "Kabul"     69.1776799999999 34.5260099999999  3.7114  .4385
      451258 2008 "Afghanistan" ""                         .                .       .      .
      451091 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      451083 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      451759 2008 "Afghanistan" ""                         .                .       .      .
      451674 2008 "Afghanistan" ""                         .                .       .      .
      451140 2008 "Afghanistan" "Nangarhar"         70.46013 34.4301999999999  4.7164  .7311
      451289 2008 "Afghanistan" "Baghlan"   68.6990799999999         35.95418  8.3825 2.1034
      end
      copy up to and including the previous line ------------------


      And here is my aid data, which I collapsed as follows:

      collapse (firstnm) country place_name state_province_admn1 region (mean) Project_Cat latitude ///
      longitude shape_leng shape_area (count) project_id1 (sum) total_commitments ibrdcommamt ///
      idacommamt totalcompletioncosts netcommitment total_disbursements totalsupervisioncost, ///
      by (sectorcode transaction_year admn1)
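
      Because this collapse is by sector as well as by province and year, the result can contain several rows for the same province-year (in the example below, Baghlan appears twice in 2002). A quick check, sketched here on the assumption that the collapsed aid data is in memory, is:
      Code:
      * Count repeated country-province-year combinations
      duplicates report country state_province_admn1 transaction_year
      * -isid- exits with an error if these variables do not uniquely identify rows;
      * -capture noisily- shows the message without stopping a do-file
      capture noisily isid country state_province_admn1 transaction_year

      If duplicates are reported, the data needs a further collapse to one row per province-year before it can serve as the using file in an m:1 merge.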


      copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input int transaction_year str38 country str43 state_province_admn1 double(latitude    longitude shape_leng    shape_area)    long    project_id1    double    total_commitments
      2002 "Afghanistan" "Uruzgan"      32.8333299999999                66  5.9468 1.2387    1  51640540
      2002 "Afghanistan" "Balkh"                   36.75                67  7.1823 1.7388    1  51640540
      2002 "Afghanistan" "Paktya"               33.66667          69.33333  4.4059  .6251    1  51640540
      2002 "Afghanistan" "Takhar"               36.66667             69.75  6.4332 1.2442    1  51640540
      2002 "Afghanistan" "Sari Pul"             36.28647  66.0668799999999  7.5405 1.5937    1  51640540
      2002 "Afghanistan" "Badakhshan"              36.75                72 19.5088 4.4252    2  70083590
      2002 "Afghanistan" "Nuristan"     35.2999999999999          70.83333  5.6526  .9155    1  51640540
      2002 "Afghanistan" "Bamyan"                  34.75             67.25  7.3532 1.3949    2  70083590
      2002 "Afghanistan" "Badghis"                    35             63.75   7.467 2.0365    1  51640540
      2002 "Afghanistan" "Hirat"                    34.5                62 14.0084 5.3537    1  51640540
      2002 "Afghanistan" "Paktika"              32.41667             68.75  7.1931 1.8287    1  51640540
      2002 "Afghanistan" "Baghlan"                 35.75                69  8.3825 2.1034    1  51640540
      2002 "Afghanistan" "Faryab"                  36.25          64.83333  8.4883 2.0306    1  51640540
      2002 "Afghanistan" "Hilmand"                    31                64 11.5198 5.5352    1  51640540
      2002 "Afghanistan" "Kunar"                   34.95          71.13333  3.8584  .4827    1  51640540
      2002 "Afghanistan" "Panjshir"     35.4024799999999          69.98291  3.3382  .3609    1  51640540
      2002 "Afghanistan" "Nangarhar"    34.3333299999999  70.4166699999999  4.7164  .7311    1  51640540
      2002 "Afghanistan" "Baghlan"      36.0011799999999          68.64001  8.3825 2.1034    1  40574708
      2002 "Afghanistan" "Parwan"                     35  68.9166699999999  4.8644  .5813    2  70083590
      2002 "Afghanistan" "Wardak"                  34.25                68  5.3246   .874    1  51640540
      2002 "Afghanistan" "Zabul"                   32.25             67.25  7.5381 1.6578    1  51640540
      2002 "Afghanistan" "Laghman"      34.7833299999999  70.1833299999999  3.1994  .3783    1  51640540
      2002 "Afghanistan" "Daykundi"                33.75             66.25  7.0677 1.7426    1  51640540
      2002 "Afghanistan" "Logar"                      34             69.25  2.9054  .3776    2  70083590
      2002 "Afghanistan" "Jawzjan"      36.8333299999999          65.83333  6.1025  1.192    1  51640540
      2002 "Afghanistan" "Ghor"                       34                65 12.3909 3.5666    1  51640540
      2002 "Afghanistan" "Kapisa"                     35          69.58333  2.2634  .1882    2  70083590
      2002 "Afghanistan" "Nimroz"                     31              62.5  9.5102 3.8617    1  51640540
      2002 "Afghanistan" "Balkh"      36.970394999999954 67.64360499999995  7.1823 1.7388    2  81149416
      2002 "Afghanistan" "Samangan"   36.218716666666666 68.02347999999996   5.307 1.1252    3 121724124
      2002 "Afghanistan" "Khost"                33.38333          69.88333  3.2954  .3897    1  51640540
      2002 "Afghanistan" "Ghazni"       33.3333299999999             67.75  9.3365 2.2184    1  51640540
      2002 "Afghanistan" "Kandahar"                   31             65.75 12.3273 5.0527    1  51640540
      2002 "Afghanistan" "Kabul"        34.4961399999999          69.10549  3.7114  .4385    1  51640540
      2002 "Afghanistan" "Farah"                    32.5              62.5 12.5395 4.6538    1  51640540
      2002 "Afghanistan" "Samangan"                   36  67.6666699999999   5.307 1.1252    1  51640540
      2003 "Afghanistan" "Zabul"                   32.25             67.25  7.5381 1.6578    2 162568692
      2003 "Afghanistan" "Badghis"                    35             63.75   7.467 2.0365    1 114400192
      2003 "Afghanistan" "Laghman"      34.7833299999999  70.1833299999999  3.1994  .3783    2 162568692
      2003 "Afghanistan" "Badghis"                    35             63.75   7.467 2.0365    2 118976196
      2003 "Afghanistan" "Wardak"                  34.25                68  5.3246   .874    2 118976196
      2003 "Afghanistan" "Khost"                33.38333          69.88333  3.2954  .3897    2 118976196
      2003 "Afghanistan" "Baghlan"                 35.75                69  8.3825 2.1034    2 118976196
      2003 "Afghanistan" "Paktika"              32.41667             68.75  7.1931 1.8287    2 162568692
      2003 "Afghanistan" "Sari Pul"             36.28647  66.0668799999999  7.5405 1.5937    2 118976196
      2003 "Afghanistan" "Kabul"       34.49806999999995 69.26107999999995  3.7114  .4385    2 118976196
      2003 "Afghanistan" "Jawzjan"      36.8333299999999          65.83333  6.1025  1.192    1 114400192
      2003 "Afghanistan" "Sari Pul"             36.28647  66.0668799999999  7.5405 1.5937    1 114400192
      2003 "Afghanistan" "Paktya"               33.66667          69.33333  4.4059  .6251    2 118976196
      2003 "Afghanistan" "Nimroz"                     31              62.5  9.5102 3.8617    1 114400192
      2003 "Afghanistan" "Paktya"               33.66667          69.33333  4.4059  .6251    1 114400192
      2003 "Afghanistan" "Bamyan"       34.7187066666666 67.04082333333334  7.3532 1.3949    3 141615396
      2003 "Afghanistan" "Uruzgan"      32.8333299999999                66  5.9468 1.2387    1 114400192
      2003 "Afghanistan" "Kandahar"                   31             65.75 12.3273 5.0527    1  47205132
      2003 "Afghanistan" "Balkh"                   36.75                67  7.1823 1.7388    2 118976196
      2003 "Afghanistan" "Faryab"                  36.25          64.83333  8.4883 2.0306    1 114400192
      2003 "Afghanistan" "Uruzgan"      32.8333299999999                66  5.9468 1.2387    1  47205132
      2003 "Afghanistan" "Zabul"                   32.25             67.25  7.5381 1.6578    1  47205132
      2003 "Afghanistan" "Kapisa"                     35          69.58333  2.2634  .1882    1  47205132
      2003 "Afghanistan" "Takhar"               36.66667             69.75  6.4332 1.2442    1  47205132
      2003 "Afghanistan" "Ghazni"       33.3333299999999             67.75  9.3365 2.2184    1 114400192
      2003 "Afghanistan" "Hilmand"                    31                64 11.5198 5.5352    2 118976196
      2003 "Afghanistan" "Parwan"                     35  68.9166699999999  4.8644  .5813    2 118976196
      2003 "Afghanistan" "Nangarhar"    34.3333299999999  70.4166699999999  4.7164  .7311    1 114400192
      2003 "Afghanistan" "Takhar"               36.66667             69.75  6.4332 1.2442    1 114400192
      2003 "Afghanistan" "Nuristan"     35.2999999999999          70.83333  5.6526  .9155    2 162568692
      2003 "Afghanistan" "Samangan"                   36  67.6666699999999   5.307 1.1252    1 114400192
      2003 "Afghanistan" "Farah"                    32.5              62.5 12.5395 4.6538    1 114400192
      2003 "Afghanistan" "Badakhshan"              36.75                72 19.5088 4.4252    1 114400192
      2003 "Afghanistan" "Hirat"                    34.5                62 14.0084 5.3537    1 114400192
      2003 "Afghanistan" "Kandahar"                   31             65.75 12.3273 5.0527    1 114400192
      2003 "Afghanistan" "Baghlan"              36.32488           68.6234  8.3825 2.1034    1 130054952
      2003 "Afghanistan" "Balkh"                   36.75                67  7.1823 1.7388    1 114400192
      2003 "Afghanistan" "Daykundi"                33.75             66.25  7.0677 1.7426    1 114400192
      2003 "Afghanistan" "Hirat"                    34.5                62 14.0084 5.3537    1  47205132
      2003 "Afghanistan" "Baghlan"                 35.75                69  8.3825 2.1034    1 114400192
      2003 "Afghanistan" "Nimroz"                     31              62.5  9.5102 3.8617    2 118976196
      2003 "Afghanistan" "Logar"                      34             69.25  2.9054  .3776    1 114400192
      2003 "Afghanistan" "Panjshir"     35.4024799999999          69.98291  3.3382  .3609    2 118976196
      2003 "Afghanistan" "Nangarhar"    34.3333299999999  70.4166699999999  4.7164  .7311    1  47205132
      2003 "Afghanistan" "Kapisa"                     35          69.58333  2.2634  .1882    1 114400192
      2003 "Afghanistan" "Paktika"              32.41667             68.75  7.1931 1.8287    1  47205132
      2003 "Afghanistan" "Faryab"                  36.25          64.83333  8.4883 2.0306    1  47205132
      2003 "Afghanistan" "Nuristan"     35.2999999999999          70.83333  5.6526  .9155    1  47205132
      2003 "Afghanistan" "Badakhshan"   36.7566533333333 71.37717666666663 19.5088 4.4252    3 166181328
      2003 "Afghanistan" "Jawzjan"      36.8333299999999          65.83333  6.1025  1.192    1  47205132
      2003 "Afghanistan" "Laghman"      34.7833299999999  70.1833299999999  3.1994  .3783    1  47205132
      2003 "Afghanistan" "Bamyan"                  34.75             67.25  7.3532 1.3949    2 162568692
      2003 "Afghanistan" "Farah"                    32.5              62.5 12.5395 4.6538    2 118976196
      2003 "Afghanistan" "Kabul"       34.49480499999995          69.21684  3.7114  .4385    2 228800384
      2003 "Afghanistan" "Ghor"                       34                65 12.3909 3.5666    1  47205132
      2003 "Afghanistan" "Kunar"                   34.95          71.13333  3.8584  .4827    2 162568692
      2003 "Afghanistan" "Samangan"                   36  67.6666699999999   5.307 1.1252    2 118976196
      2003 "Afghanistan" "Ghor"                       34                65 12.3909 3.5666    1 114400192
      2003 "Afghanistan" "Daykundi"                33.75             66.25  7.0677 1.7426    1  47205132
      2003 "Afghanistan" "Khost"                33.38333          69.88333  3.2954  .3897    1 114400192
      2003 "Afghanistan" "Kunar"                   34.95          71.13333  3.8584  .4827    1  47205132
      2003 "Afghanistan" "Hilmand"                    31                64 11.5198 5.5352    1 114400192
      2003 "Afghanistan" "Wardak"                  34.25                68  5.3246   .874    1 114400192
      2003 "Afghanistan" "Logar"                      34             69.25  2.9054  .3776    1  47205132
      end
      copy up to and including the previous line ------------------


      Kindly advise whether the code you suggested earlier still applies to these data, or whether you would now recommend something different.

      Kind regards,





      • #4
        So, with some modifications, mostly using the appropriate variable names, you are probably OK, starting from the firms data set and the (partially) collapsed aid data set.

        Code:
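        * Finish collapsing the aid data to one row per country-province-year
        use aid_dataset, clear
        collapse (count) project_id1 (sum) total_commitments (first) shape_leng shape_area, ///
            by(country state_province_admn1 transaction_year)
        tempfile aid_summary
        save `aid_summary'

        * Build the 2-year-lagged key in the firm data and align the variable names
        use firms_dataset, clear
        gen transaction_year = year - 2
        rename province_admn1 state_province_admn1
        * -update replace- flags shared variables whose nonmissing values disagree
        merge m:1 country state_province_admn1 transaction_year using `aid_summary', update replace
        drop transaction_year
        * Stop if any shared variable conflicts between the two files
        assert _merge != 5
        tab _merge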
        I call your attention to the code that represents a substantive change from what was previously suggested: the -update replace- options on -merge-, the -assert _merge != 5-, and the -tab _merge-.

        When I ran the code on your example data, I noted that there are many units (combinations of country, state_province_admn1, and transaction_year) in each data set that find no match in the other. Perhaps that won't happen in your full data, and perhaps it is OK even if it does. But you should be aware of it. That is the reason for the -tab _merge- at the end. You will see how many such units there are, and you can -list- or -browse- the observations that have _merge == 1 or _merge == 2 to see whether they represent a problem.

        The -update replace- options to -merge- and the -assert _merge != 5- are added because both files contain the longitude, latitude, and shape data. The added code checks for any inconsistencies between the two files on those variables. If there are any, the -assert- command will interrupt program execution with an error message. You will then find the offending observations in the data characterized by _merge == 5. Any such contradictions are, of course, a problem that you will need to resolve by determining which of the conflicting values (if either) is correct before proceeding further. (If there are no conflicting data, the -assert- command will produce no output and will not interrupt the flow of the program.)
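
        As a rough sketch of that inspection step, something like the following could be used. It assumes it runs in the same do-file after `aid_summary' has been saved, and that transaction_year has not yet been dropped. The firm_shape_* copies are hypothetical helper variables introduced here, because -update replace- overwrites the firm-side values with those from the aid file.
        Code:
        use firms_dataset, clear
        gen transaction_year = year - 2
        rename province_admn1 state_province_admn1
        * Keep the firm-side values so any conflicts can be compared afterwards
        clonevar firm_shape_leng = shape_leng
        clonevar firm_shape_area = shape_area
        merge m:1 country state_province_admn1 transaction_year using `aid_summary', update replace
        tab _merge
        * Conflicting nonmissing values: firm-side copy next to the aid-file value
        list country state_province_admn1 firm_shape_leng shape_leng ///
            firm_shape_area shape_area if _merge == 5
        * Firms with no aid record two years before the survey
        list idstd country state_province_admn1 year if _merge == 1
        * Aid province-years with no surveyed firm
        list country state_province_admn1 transaction_year if _merge == 2

        In the example data, firm observations with an empty province_admn1 cannot match any aid record, so they will appear with _merge == 1.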
