Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problems with merge command - panel data

    Dear Stata community,

    I am constructing my dataset (a panel data) in order to estimate a gravity model further on.

    I am having issues merging some variables from a gravity database onto my main dataset. The latter includes bilateral trade by HS6 product codes that I have selected, through years 1996-2020.

    Whenever I try to merge a variable such as distance, an error occurs:

    merge 1:m year iso3num_o iso3num_d dist using rawdata\Comtrade\Gravity_V202211
    variables year iso3num_o iso3num_d dist do not uniquely identify observations in the master data
    r(459)

    I understand that possibly the problem may come from my main dataset, as there is a repetition in the years (as a pair can trade more than one product in the same year). I have then tried to delete the dupes by replacing the HS6 product codes by the NACE revision classification, which groups several of the HS product codes into one code, and then by doing the total of the value and the quantity, to eliminate the duplicates that were there. However, the issue still remains.

    I have also tried to delete from the gravity dataset some observations that aren't useful (such as the years previous to 1996, and trade pairs that are not listed into my main dataset). However, this seems to be quite cumbersome and I am not sure that it would actually solve the issue.

    Is there a way to go through this problem?

    Below you can see the dataex of my main dataset:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int(year iso3num_o iso3num_d) str6 hs6 float value double quantity str32 exporter_name str52 importer_name float bilateral
    1996 842  4 "854190"  13.049      . "USA"                  "Afghanistan" 20808
    1996 842  4 "853400"   75.27   .084 "USA"                  "Afghanistan" 20808
    1996 842  4 "854250"  84.861   .124 "USA"                  "Afghanistan" 20808
    1996 246  4 "854140"    1.11    .04 "Finland"              "Afghanistan"  6871
    1996 842  4 "854230"  26.436   .006 "USA"                  "Afghanistan" 20808
    1996 842  4 "854110"     5.5   .002 "USA"                  "Afghanistan" 20808
    1996 842  4 "854290"  24.264      . "USA"                  "Afghanistan" 20808
    1996 842  4 "854140"   2.841      . "USA"                  "Afghanistan" 20808
    1996 842  4 "854212"     7.6   .011 "USA"                  "Afghanistan" 20808
    1996 842  4 "854214"  86.694   .427 "USA"                  "Afghanistan" 20808
    1996 842  4 "854213"  87.871   .625 "USA"                  "Afghanistan" 20808
    1996 842  4 "854219"  23.414   .015 "USA"                  "Afghanistan" 20808
    1996 484  4 "853400"   4.559   .359 "Mexico"               "Afghanistan" 12635
    1996 842  4 "854240"   7.324   .002 "USA"                  "Afghanistan" 20808
    1996 381  8 "853400"  81.715   .887 "Italy"                "Albania"      9882
    1996 251  8 "854212"  87.164  1.312 "France"               "Albania"      7140
    1996  40  8 "854213"  72.171   .066 "Austria"              "Albania"      1089
    1996 381  8 "854250" 176.064      . "Italy"                "Albania"      9882
    1996 276  8 "853400"   1.845   .014 "Germany"              "Albania"      7663
    1996 792  8 "854190"   7.266  2.125 "Turkey"               "Albania"     20471
    1996 381  8 "903141"   1.213   .053 "Italy"                "Albania"      9882
    1996 203  8 "854110"    .663   .009 "Czechia"              "Albania"      5213
    1996  40  8 "854214"   4.156   .001 "Austria"              "Albania"      1089
    1996 842  8 "854290"  31.759      . "USA"                  "Albania"     20809
    1996 372  8 "853400"   1.906   .222 "Ireland"              "Albania"      9505
    1996 251  8 "854230"   3.126   .005 "France"               "Albania"      7140
    1996 381  8 "854213"   3.221   .074 "Italy"                "Albania"      9882
    1996  58  8 "854290"  66.911   .014 "Belgium-Luxembourg"   "Albania"      1996
    1996 300  8 "854110"    .668   .001 "Greece"               "Albania"      8036
    1996 276  8 "854213"    .664      . "Germany"              "Albania"      7663
    1996 276  8 "854230"   1.163      . "Germany"              "Albania"      7663
    1996 381  8 "854129"   9.284   .003 "Italy"                "Albania"      9882
    1996 300  8 "854121"   4.557    .05 "Greece"               "Albania"      8036
    1996 381  8 "854110"   2.365      . "Italy"                "Albania"      9882
    1996 381  8 "854219"   5.849   .047 "Italy"                "Albania"      9882
    1996 757  8 "854129"  17.031    .21 "Switzerland"          "Albania"     19433
    1996 348  8 "854213"      10      . "Hungary"              "Albania"      8635
    1996 807  8 "854212"   1.108   .003 "TFYR of Macedonia"    "Albania"     19729
    1996 381  8 "854212"  19.383   .081 "Italy"                "Albania"      9882
    1996 381  8 "854290"  24.037   .824 "Italy"                "Albania"      9882
    1996 348  8 "903141"  13.045     .8 "Hungary"              "Albania"      8635
    1996  40  8 "854110"   1.228   .034 "Austria"              "Albania"      1089
    1996  40  8 "854212"   5.357      . "Austria"              "Albania"      1089
    1996 381  8 "854121"  23.484    .01 "Italy"                "Albania"      9882
    1996 276  8 "854190"  57.584  1.187 "Germany"              "Albania"      7663
    1996 276  8 "854129"   1.328   .004 "Germany"              "Albania"      7663
    1996  58  8 "903141"   1.189   .018 "Belgium-Luxembourg"   "Albania"      1996
    1996 842  8 "854250" 112.423      . "USA"                  "Albania"     20809
    1996 842  8 "854110" 148.576   .029 "USA"                  "Albania"     20809
    1996 807  8 "854230"   1.458   .004 "TFYR of Macedonia"    "Albania"     19729
    1996 381  8 "854140"  18.992   .613 "Italy"                "Albania"      9882
    1996 156 12 "853400"   3.392   .476 "China"                "Algeria"      3825
    1996 156 12 "854110"  17.364  2.077 "China"                "Algeria"      3825
    1996 251 12 "854240" 203.007   .345 "France"               "Algeria"      7141
    1996 381 12 "854230"  99.475   .203 "Italy"                "Algeria"      9883
    1996 372 12 "854212"  31.513   .468 "Ireland"              "Algeria"      9506
    1996 410 12 "903141"  30.343  1.312 "Rep. of Korea"        "Algeria"     16029
    1996 528 12 "854250"  16.046   .153 "Netherlands"          "Algeria"     13582
    1996  40 12 "854140"   1.889   .049 "Austria"              "Algeria"      1090
    1996 752 12 "854250"   1.039   .002 "Sweden"               "Algeria"     19221
    1996 826 12 "854290"   7.766   .003 "United Kingdom"       "Algeria"     21494
    1996 344 12 "854121"  21.227    .14 "China, Hong Kong SAR" "Algeria"      4041
    1996 752 12 "854121"  34.451   .261 "Sweden"               "Algeria"     19221
    1996 579 12 "854250"   6.552   .002 "Norway"               "Algeria"     14345
    1996 392 12 "854140"   6.379   .005 "Japan"                "Algeria"     10194
    1996 251 12 "854290" 382.259 10.445 "France"               "Algeria"      7141
    1996 251 12 "854214"  28.285   .236 "France"               "Algeria"      7141
    1996 702 12 "854129"  27.113  6.375 "Singapore"            "Algeria"     17699
    1996 842 12 "854160"   9.144   .004 "USA"                  "Algeria"     20810
    1996 826 12 "854212"   8.005   .062 "United Kingdom"       "Algeria"     21494
    1996 392 12 "854212"  97.832   .738 "Japan"                "Algeria"     10194
    1996 826 12 "854213"  40.325    .05 "United Kingdom"       "Algeria"     21494
    1996 381 12 "854240" 150.248   .984 "Italy"                "Algeria"      9883
    1996  58 12 "854140"  14.585   .121 "Belgium-Luxembourg"   "Algeria"      1997
    1996 724 12 "854250"  17.492    .25 "Spain"                "Algeria"     18645
    1996 458 12 "854230"  35.667   .132 "Malaysia"             "Algeria"     11879
    1996 757 12 "854140" 131.027  5.113 "Switzerland"          "Algeria"     19434
    1996  40 12 "854219"   7.485   .012 "Austria"              "Algeria"      1090
    1996 381 12 "854130"   92.99   .359 "Italy"                "Algeria"      9883
    1996 156 12 "854129"  24.779   .292 "China"                "Algeria"      3825
    1996 724 12 "854140" 524.997 18.755 "Spain"                "Algeria"     18645
    1996 764 12 "854110"   1.448   .027 "Thailand"             "Algeria"     19875
    1996 842 12 "854190"   7.639   .093 "USA"                  "Algeria"     20810
    1996 752 12 "854190"    .768   .009 "Sweden"               "Algeria"     19221
    1996 757 12 "854129"  19.035   .299 "Switzerland"          "Algeria"     19434
    1996 762 12 "854219"   1.842   .292 "Tajikistan"           "Algeria"     19814
    1996 392 12 "854250" 105.024   .171 "Japan"                "Algeria"     10194
    1996 156 12 "854250"   3.227   .156 "China"                "Algeria"         .
    1996 752 12 "854129"   39.23   .234 "Sweden"               "Algeria"     19221
    1996 381 12 "903141"   3.895   .002 "Italy"                "Algeria"      9883
    1996 124 12 "854219"  23.044   .171 "Canada"               "Algeria"      3346
    1996 724 12 "854121"   8.989   .203 "Spain"                "Algeria"     18645
    1996 490 12 "854250"  28.005   .277 "Other Asia, nes"      "Algeria"     14633
    1996 702 12 "854219"  13.804   .472 "Singapore"            "Algeria"     17699
    1996 826 12 "854121"    .829   .003 "United Kingdom"       "Algeria"     21494
    1996 276 12 "854213"  15.947    .01 "Germany"              "Algeria"      7664
    1996 752 12 "854110"  26.195   1.76 "Sweden"               "Algeria"     19221
    1996 757 12 "854250"  34.108   .025 "Switzerland"          "Algeria"     19434
    1996 208 12 "854130"  73.246   .445 "Denmark"              "Algeria"      5702
    1996 381 12 "854160"   2.979   .105 "Italy"                "Algeria"      9883
    end
    label values bilateral trade
    label def trade 1089 "Austria Albania", modify
    label def trade 1090 "Austria Algeria", modify
    label def trade 1996 "Belgium-Luxembourg Albania", modify
    label def trade 1997 "Belgium-Luxembourg Algeria", modify
    label def trade 3346 "Canada Algeria", modify
    label def trade 3825 "China Algeria", modify
    label def trade 4041 "China, Hong Kong SAR Algeria", modify
    label def trade 5213 "Czechia Albania", modify
    label def trade 5702 "Denmark Algeria", modify
    label def trade 6871 "Finland Afghanistan", modify
    label def trade 7140 "France Albania", modify
    label def trade 7141 "France Algeria", modify
    label def trade 7663 "Germany Albania", modify
    label def trade 7664 "Germany Algeria", modify
    label def trade 8036 "Greece Albania", modify
    label def trade 8635 "Hungary Albania", modify
    label def trade 9505 "Ireland Albania", modify
    label def trade 9506 "Ireland Algeria", modify
    label def trade 9882 "Italy Albania", modify
    label def trade 9883 "Italy Algeria", modify
    label def trade 10194 "Japan Algeria", modify
    label def trade 11879 "Malaysia Algeria", modify
    label def trade 12635 "Mexico Afghanistan", modify
    label def trade 13582 "Netherlands Algeria", modify
    label def trade 14345 "Norway Algeria", modify
    label def trade 14633 "Other Asia, nes Algeria", modify
    label def trade 16029 "Rep. of Korea Algeria", modify
    label def trade 17699 "Singapore Algeria", modify
    label def trade 18645 "Spain Algeria", modify
    label def trade 19221 "Sweden Algeria", modify
    label def trade 19433 "Switzerland Albania", modify
    label def trade 19434 "Switzerland Algeria", modify
    label def trade 19729 "TFYR of Macedonia Albania", modify
    label def trade 19814 "Tajikistan Algeria", modify
    label def trade 19875 "Thailand Algeria", modify
    label def trade 20471 "Turkey Albania", modify
    label def trade 20808 "USA Afghanistan", modify
    label def trade 20809 "USA Albania", modify
    label def trade 20810 "USA Algeria", modify
    label def trade 21494 "United Kingdom Algeria", modify

    Thanks in advance for your help,
    Eleonora

  • #2
    Perhaps you have your merge command backwards. You have
    Code:
    merge 1:m year iso3num_o iso3num_d dist using rawdata\Comtrade\Gravity_V202211
    but, as you say, in your trade data there are multiple observations for some combinations of the merge variables because in a given year a trade pair may have more than one product traded.

    So perhaps what you need is
    Code:
    merge m:1 year iso3num_o iso3num_d dist using rawdata\Comtrade\Gravity_V202211
    if your merge variables uniquely identify observations in the gravity dataset.

    Comment


    • #3
      William Lisowski thank you for your reply! I tried that way too but I still get an error:
      merge m:1 year iso3num_o iso3num_d dist using rawdata\Comtrade\Gravity_V202211
      variables year iso3num_o iso3num_d dist do not uniquely identify observations in the using data
      r(459)

      Comment


      • #4
        It seems to me there are three possibilities.

        1. You are failing to take account of one or more additional variables in your data set that, combined with the variables you are trying to use as the merge key, uniquely identify the observations in one or both of your data sets, so you can use merge 1:m or merge m:1 or merge 1:1 with the expanded merge key.

        2. You are really trying to accomplish what joinby, a different command, does: create in the output dataset every possible combination of an observation from the first dataset and an observation from the second dataset, both having the same merge variables.

        3. The data sets you are trying to merge are incorrectly configured or contain data errors that need to be fixed.

        With regard to the third possibility, you tell me in post #3 that the merge variables do not uniquely identify observations in the using data, but you do not explain why that is the case. Is that what you expect of your "gravity data" (whatever that may be)?

        Comment


        • #5
          The variables that my main dataset has in common with the gravity dataset are: year iso3num_o and iso3num_d (I deleted the dist variable from the trade dataset because it was empty)

          Here a sample of the "gravity data" with dataex (the data goes from 1996 to 2020)

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input int(year iso3num_o iso3num_d dist) byte contig float diplo_disagreement double(gdp_o gdp_d) byte(wto_o wto_d eu_o eu_d)
          1996 4  32 15271 0 1.8770447 .      329671180.288 0 1 0 0
          1996 4  40  4571 0  2.293679 .      236720488.448 0 1 0 1
          1996 4  51  2265 0  1.876729 .         1596968.96 0 0 0 0
          1996 4  56  5420 0  2.762058 .       280791285.76 0 1 0 1
          1996 4  76 13759 0 1.0338335 .       839682621.44 0 1 0 0
          1996 4 100  4043 0  2.503425 .       10110256.128 0 1 0 0
          1996 4 124 10777 0  2.500256 .  626950471.6800001 0 1 0 0
          1996 4 144  3246 0  .2184785 .        13897738.24 0 1 0 0
          1996 4 156  4852 1   .124076 .       856084643.84 0 0 0 0
          1996 4 158  5109 0         . .          292473000 0 0 0 0
          1996 4 170 14220 0   .493075 .        97160110.08 0 1 0 0
          1996 4 178  7085 0  .6705618 .          2540697.6 0 0 0 0
          1996 4 188 14339 0  .9887574 .       11843228.672 0 1 0 0
          1996 4 191  4610 0  2.213011 .        23678013.44 0 0 0 0
          1996 4 203  4712 0  2.570414 .       66775130.112 0 1 0 0
          1996 4 204  7487 0  .5082176 .        2361116.416 0 1 0 0
          1996 4 208  4861 0  2.613074 .      187632402.432 0 1 0 1
          1996 4 214 12721 0 1.1092999 .       18131812.352 0 1 0 0
          1996 4 218 15209 0  .8809356 .        25225871.36 0 1 0 0
          1996 4 222 14202 0 1.0570226 .        10315499.52 0 1 0 0
          1996 4 233  4239 0  2.473138 .        4739181.568 0 0 0 0
          1996 4 246  4259 0  2.677842 . 132099407.87200001 0 1 0 1
          1996 4 250  5594 0  3.057559 .      1614245396.48 0 1 0 1
          1996 4 268  2268 0 2.0435443 . 3094915.5840000003 0 0 0 0
          1996 4 276  4787 0  2.554902 .     2501572558.848 0 1 0 1
          1996 4 278     . .         . .                  . 0 . 0 .
          1996 4 280     . .         . .                  . 0 . 0 .
          1996 4 300  4060 0  2.600846 .      146558599.168 0 1 0 1
          1996 4 320 14161 0  1.218803 .       15781579.776 0 1 0 0
          1996 4 340 14077 0 1.0208087 .        4034037.248 0 1 0 0
          1996 4 344  4574 0         . .      159717228.544 0 1 0 0
          1996 4 348  4371 0  2.650514 .        46448783.36 0 1 0 0
          1996 4 360  5999 0   .321666 .                  . 0 1 0 0
          1996 4 360     . .         . .                  . 0 . 0 .
          1996 4 372  6091 0  2.193463 .        75695882.24 0 1 0 1
          1996 4 376  3199 0   3.76023 . 109160636.41600001 0 1 0 0
          1996 4 380  4941 0  2.601861 .     1309407707.136 0 1 0 1
          1996 4 392  6282 0  2.235877 .      4706187345.92 0 1 0 0
          1996 4 398  1177 0 1.7217416 .       21035358.208 0 0 0 0
          1996 4 404  5206 0  .8895855 .        12045864.96 0 1 0 0
          1996 4 410  5137 0   1.75281 .      603413151.744 0 1 0 0
          1996 4 440  4046 0  2.569684 .        8426599.936 0 0 0 0
          1996 4 458  4836 0  .4599758 . 100851392.51200001 0 1 0 0
          1996 4 458     . .         . .                  . 0 . 0 .
          1996 4 484 13907 0  .7100456 .      397404143.616 0 1 0 0
          1996 4 508  7728 0   .605705 .        3241719.296 0 1 0 0
          1996 4 528  5366 0  2.749204 .       442598686.72 0 1 0 1
          1996 4 540 12022 0         . . 3606968.3200000003 0 0 0 0
          1996 4 554 13464 0 1.8727163 .       69563621.376 0 1 0 0
          1996 4 558 14229 0 1.0592551 .            4308352 0 1 0 0
          1996 4 566  7393 0   .379965 .        34987950.08 0 1 0 0
          1996 4 570 13992 0         . .                  . 0 0 0 0
          1996 4 574 12523 0         . .                  . 0 0 0 0
          1996 4 578  5017 0  2.521991 .      160158302.208 0 1 0 0
          1996 4 586  1086 1   .069106 .       63320170.496 0 1 0 0
          1996 4 586     . .         . .                  . 0 . 0 .
          1996 4 608  5622 0   .544507 .        82848194.56 0 1 0 0
          1996 4 616  4268 0  2.525153 .      157079207.936 0 1 0 0
          1996 4 643  3370 0 1.9578217 .      391721385.984 0 0 0 0
          1996 4 686  8796 0   .756528 .        5065829.888 0 1 0 0
          1996 4 688     . .         . .                  . 0 . 0 .
          1996 4 690  4566 0  .6030646 .         503060.032 0 0 0 0
          1996 4 694  8789 0  .7801207 .         941742.144 0 1 0 0
          1996 4 702  5147 0  .7204279 .       96400965.632 0 1 0 0
          1996 4 703  4516 0  2.520546 .       27821914.112 0 1 0 0
          1996 4 705  4722 0  2.522695 .        21478225.92 0 1 0 0
          1996 4 706  4364 0         . .                  . 0 0 0 0
          1996 4 710  9242 0 1.0233248 .       143731998.72 0 1 0 0
          1996 4 716  7061 0  .4169278 .        8553146.368 0 1 0 0
          1996 4 720     . .         . .                  . 0 . 0 .
          1996 4 724  6295 0  2.518657 .      640998309.888 0 1 0 1
          1996 4 728     . .         . .                  . 0 . 0 .
          1996 4 729     . .         . .                  . 0 . 0 .
          1996 4 732  7704 0         . .                  . 0 0 0 0
          1996 4 736  4219 0   .595384 .                  . 0 0 0 0
          1996 4 740 12673 0 1.1162937 .         860630.912 0 1 0 0
          1996 4 748  7839 0  1.058636 .         1602741.76 0 1 0 0
          1996 4 752  4599 0  2.269248 .      288103923.712 0 1 0 1
          1996 4 756  5165 0         . . 329619341.31200004 0 1 0 0
          1996 4 760  2902 0   .783815 .       13789560.832 0 0 0 0
          1996 4 762   446 1  1.946054 .        1043893.056 0 0 0 0
          1996 4 764  3905 0  .6017286 . 181947629.56800002 0 1 0 0
          1996 4 768  7611 0  .4526975 .        1465448.192 0 1 0 0
          1996 4 772 13303 0         . .                  . 0 0 0 0
          1996 4 776 13619 0         . .         219583.568 0 0 0 0
          1996 4 780 12805 0  .9696379 .        5759537.664 0 1 0 0
          1996 4 784  1684 0  .5869639 .       73571237.888 0 1 0 0
          1996 4 788  5261 0  .4099458 .       19587321.856 0 1 0 0
          1996 4 792  3583 0 1.9338244 . 181475557.37600002 0 1 0 0
          1996 4 795  1041 1  .3615486 .        2379281.664 0 0 0 0
          1996 4 796 12529 0         . .                  . 0 0 0 0
          1996 4 798 12380 0         . .            12335.2 0 0 0 0
          1996 4 800  5368 0  .1630626 .        6044585.472 0 1 0 0
          1996 4 804  3580 0 1.7832513 .       44558077.952 0 0 0 0
          1996 4 807  4204 0  2.289804 .        4422159.872 0 0 0 0
          1996 4 810     . .         . .                  . 0 . 0 .
          1996 4 818  3587 0  .2939352 .       67629719.552 0 1 0 0
          1996 4 826  5720 0  3.302093 .     1304507056.128 0 1 0 1
          1996 4 834  5553 0   .051361 .        6496195.584 0 1 0 0
          1996 4 840 10862 0  4.315104 .      8100199792.64 0 1 0 0
          end
          I do not understand why actually it gives me the error "variables year iso3num_o iso3num_d do not uniquely identify observations in the using data"
          Last edited by Eleonora De Nicolo; 21 Jan 2023, 10:10.

          Comment


          • #6
            I do not understand why actually it gives me the error "variables year iso3num_o iso3num_d do not uniquely identify observations in the using data"
            In the example data in post #5 you have three pairs of duplicates.
            Code:
            1996 4 360 5999 0  .321666 .                  . 0 1 0 0 1
            1996 4 360    . .        . .                  . 0 . 0 . 1
            
            1996 4 458 4836 0 .4599758 . 100851392.51200001 0 1 0 0 1
            1996 4 458    . .        . .                  . 0 . 0 . 1
            
            1996 4 586 1086 1  .069106 .       63320170.496 0 1 0 0 1
            1996 4 586    . .        . .                  . 0 . 0 . 1

            Comment


            • #7
              Thank you so much, I deleted the duplicates and now it works!

              Comment

              Working...
              X