Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging withOUT dropping duplicates

    Hi,

    I am having some trouble merging two data sets because the duplicates (unique identifier variable, HHID) contain different information across each row. I have attached screenshots for a clearer picture

    Click image for larger version

Name:	Screen Shot 2018-07-03 at 8.18.21 PM.png
Views:	1
Size:	250.3 KB
ID:	1451757 Click image for larger version

Name:	Screen Shot 2018-07-03 at 8.17.52 PM.png
Views:	1
Size:	350.1 KB
ID:	1451756

    Thank you for any advice!

  • #2
    Please read the FAQ, with special attention to #12 for excellent advice on posting questions so as to maximize your chances of a helpful and timely response. There you will learn, among other things, that screen shots are deprecated because a) they are often unreadable (as is the case here, at least on my computer) and b) even when readable they are an incomplete representation of the data and often omit details that are important, and c) if somebody who wants to help you needs to try out and test some code, there is no way to import data from a screenshot into Stata.

    So please post back using the -dataex- command to show your example. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      Hi Clyde,

      Thank you for the helpful feedback. I have used dataex and here are the two sets of data that I am trying to merge. The identifier variable is HHID and it is duplicated for both datasets and will not merge. There are also missing data for some of the variables in both data sets, should those be deleted before or after merging? Sorry for the questions, this is my first time analyzing survey data. Thank you.

      input str14 HHID byte a4aq1 str20 a4aq5 byte a4aq7 double(a4aq8 a4aq9) byte a4aq10
      "1013000204" . "" . . . .
      "1021000108" 1 "Cassava" 2 0 . 2
      "1021000108" 1 "Banana food" 2 .5 . 2
      "1021000113" 1 "Maize" 2 .25 5 2
      "1021000113" 1 "Beans" 2 .25 10 1
      "1021000113" . "Dodo" 2 . . .
      "1021000113" 1 "Yam" 2 .25 15 1
      "1021000408" 1 "Banana food" 1 .12999999523162842 . 2
      "1021000408" 1 "Maize" 2 .3700000047683716 70 1
      "1021000408" 1 "Beans" 2 .3700000047683716 30 1
      "1021000710" 1 "Maize" 2 .25 20 1
      "1021000710" 1 "Beans" 2 .25 20 1
      "1021000710" 1 "Banana food" 2 .25 60 2
      "1021000807" 1 "Coffee all" 1 2 . 2
      "1021000807" 1 "Natural pastures" 1 2 . 2
      "1021000807" 1 "Banana food" 1 1 . 2
      "1021000807" 1 "Cassava" 1 1.5 . 2
      "102100080803" 1 "Banana food" 2 2 40 2
      "102100080803" 1 "Coffee all" 2 2 60 2
      "102100080803" 1 "Maize" 2 .25 70 2
      "102100080803" 1 "Beans" 2 .25 30 2
      "102100080803" 1 "Maize" 2 .25 40 2
      "102100080803" 1 "Groundnuts" 2 .25 60 2
      "102100080803" 1 "Maize" 2 1 55 2
      "102100080803" 1 "Cassava" 2 1 45 2
      "102100080803" 1 "Sweet potatoes" 1 .25 . 2
      "102100080803" . "" . . . .
      "102100110201" . "" . . . .
      "102100110201" . "" . . . .
      "102100110201" 1 "Groundnuts" 2 .5 40 2
      "102100110201" 1 "Coffee all" 2 .5 40 2
      "102100110201" 1 "Cocoa" 2 .5 20 2
      "102100110201" 1 "Groundnuts" 1 .5 . 2
      "102100110201" 1 "Beans" 2 .5 75 1
      "102100110201" 1 "Cassava" 2 .5 25 2
      "102100110201" . "" . . . .
      "1021001109" 1 "Sweet potatoes" 1 .15000000596046448 . 2
      "1021001109" 1 "Cassava" 2 .25 40 2
      "1021001109" 1 "Banana food" 2 .25 60 2
      "1021001109" 1 "Maize" 1 .6000000238418579 . 1
      "1021001109" 1 "Yam" 1 2.25 . 2
      "1021001109" 1 "Sugarcane" 1 .25 . 2
      "1021001109" . "" . . . .
      "1021001304" 1 "Maize" 2 .05000000074505806 20 1
      "1021001304" 1 "Beans" 2 .05000000074505806 20 1
      "1021001304" 1 "Cassava" 2 .05000000074505806 40 2
      "1021001304" 1 "Banana food" 2 .05000000074505806 20 2
      "1021001910" 1 "Banana food" 1 .25 . 2
      "1021002501" . "" . . . .
      "1021002610" 1 "Sweet potatoes" 2 .25 50 2
      "1021002610" 1 "Banana food" 2 .25 30 2
      "1021002610" 1 "Coffee all" 2 .25 20 2
      "1021002610" 1 "Cassava" 2 .5 50 2
      "1021002610" 1 "Yam" 2 .5 50 2
      "1021002610" 1 "Maize" 2 .25 30 2
      "1021002610" 1 "Groundnuts" 2 .25 70 2
      "1021002610" 1 "Sweet potatoes" 1 .5 . 2
      "1021002611" 1 "Maize" 1 1 . 1
      "1021002810" 1 "Maize" 2 .20000000298023224 60 1
      "1021002810" 1 "Beans" 2 .20000000298023224 40 1
      "1021003309" . "" . . . .
      "1033000301" 1 "Maize" 2 1.5 80 2
      "1033000301" 1 "Banana food" 2 .5 20 2
      "1033000302" 1 "Beans" 2 .12999999523162842 20 2
      "1033000302" 1 "Cassava" 2 .12999999523162842 20 2
      "1033000302" 1 "Beans" 2 .25 20 2
      "1033000302" 1 "Banana food" 2 .25 20 2
      "1033000302" 1 "Coffee all" 2 .25 20 2
      "1033000302" . "" . . . .
      "1033000303" 1 "Sweet potatoes" 1 .25 . 2
      "1033000303" 1 "Beans" 2 .5 10 1
      "1033000303" 1 "Cassava" 2 .5 10 2
      "1033000303" 1 "Maize" 1 .5 . 1
      "1033000303" 1 "Banana food" 1 1 . 2
      "1033000303" . "" . . . .
      "1033000303" . "" . . . .
      "1033000304" 1 "Banana food" 2 .5 60 2
      "1033000304" 1 "Coffee all" 2 .5 40 2
      "1033000304" 1 "Banana beer" 1 5.5 . 2
      "1033000304" 1 "Sweet potatoes" 1 .5 . 2
      "1033000304" 1 "Maize" 1 .5 . 2
      "1033000304" 1 "Maize" 2 1.5 20 1
      "1033000304" 1 "Beans" 2 1.5 80 1
      "103300030403" 1 "Banana food" 2 1 30 2
      "103300030403" 1 "Coffee all" 2 1 70 2
      "103300030403" 1 "Banana food" 1 .25 . 2
      "103300030403" 1 "Maize" 2 3 80 2
      "103300030403" 1 "Beans" 2 3 10 1
      "103300030403" 1 "Cassava" 2 3 10 2
      "1033000305" . "" . . . .
      "1033000305" . "" . . . .
      "1033000305" . "" . . . .
      "1033000305" . "" . . . .
      "1033000307" 1 "Banana beer" 1 100 . 2
      "1033000307" 1 "Beans" 2 .5 25 1
      "1033000307" 1 "Banana beer" 2 1.5 75 2
      "1033000307" 1 "Cassava" 2 .25 13 2
      "1033000307" 1 "Banana food" 2 1.75 88 2
      "1033000307" . "" . . . .
      "1033000307" . "" . . . .
      end
      label values a4aq1 A4AQ1
      label def A4AQ1 1 "Yes", modify
      label values a4aq7 A4AQ7
      label def A4AQ7 1 "Pure Stand", modify
      label def A4AQ7 2 "Inter cropped", modify
      label values a4aq10 A4AQ10
      label def A4AQ10 1 "Yes", modify
      label def A4AQ10 2 "No", modify
      [/CODE]


      input str14 HHID str20 a5aq4 double a5aq6a int(a5aq6b a5aq6c) double a5aq7a
      "1013000204" "" . . . .
      "1021000108" "Banana food" 48 20 68 0
      "1021000108" "Cassava" 3 20 13 0
      "1021000113" "Beans" 3 24 22 0
      "1021000113" "Dodo" 2 20 22 0
      "1021000113" "Dodo" 10 20 13 0
      "1021000113" "Maize" 1 22 22 0
      "1021000113" "Yam" 99999 . . .
      "1021000408" "Banana food" 99999 . . .
      "1021000408" "Beans" 2 35 22 0
      "1021000408" "Maize" 1 12 13 0
      "1021000710" "Banana food" 99999 . . .
      "1021000710" "Beans" 3.5 24 1 0
      "1021000710" "Maize" 60 13 1 0
      "1021000807" "Coffee all" 7 45 13 7
      "1021000807" "" . . . .
      "1021000807" "Banana food" 180 20 68 80
      "1021000807" "Banana food" 30 20 69 0
      "1021000807" "Banana food" 90 20 67 90
      "1021000807" "Cassava" 17 20 13 17
      "102100080803" "Banana food" 20 20 68 0
      "102100080803" "Coffee all" 0 . . .
      "102100080803" "Beans" 1 45 20 0
      "102100080803" "Maize" 1 45 10 1
      "102100080803" "Groundnuts" 2 42 13 0
      "102100080803" "Maize" 1 45 12 1
      "102100080803" "Cassava" 2 20 9 0
      "102100080803" "Maize" 2.5 45 10 1
      "102100080803" "Sweet potatoes" 1 20 9 0
      "102100080803" "" . . . .
      "102100110201" "Beans" 1 35 10 0
      "102100110201" "" . . . .
      "102100110201" "Cocoa" 4 24 1 4
      "102100110201" "Coffee all" 1 24 21 1
      "102100110201" "Groundnuts" 3 22 12 0
      "102100110201" "Groundnuts" 4 22 13 2
      "102100110201" "Beans" 1 24 37 0
      "102100110201" "Cassava" 99999 . . .
      "102100110201" "" . . . .
      "1021001109" "Sweet potatoes" 32 20 22 0
      "1021001109" "Banana food" 12 20 67 0
      "1021001109" "Cassava" 2 . 10 0
      "1021001109" "Maize" .5 21 9 0
      "1021001109" "Yam" 10 20 9 6
      "1021001109" "Sugarcane" 120 . 85 0
      "1021001109" "" . . . .
      "1021001304" "Banana food" 4 20 68 0
      "1021001304" "Beans" 4 45 32 0
      "1021001304" "Beans" 10 24 66 0
      "1021001304" "Cassava" 5 20 22 0
      "1021001304" "Maize" 1 24 22 0
      "1021001910" "Banana food" 99999 . . .
      "1021002501" "" . . . .
      "1021002610" "Banana food" 10 20 68 0
      "1021002610" "Coffee all" 99999 . . .
      "1021002610" "Sweet potatoes" 8 20 20 3
      "1021002610" "Cassava" 99999 . . .
      "1021002610" "Yam" 5 20 20 0
      "1021002610" "Groundnuts" 1 45 20 0
      "1021002610" "Maize" 1 45 20 0
      "1021002610" "Sweet potatoes" 2 20 20 0
      "1021002611" "Maize" 1 40 10 0
      "1021002611" "Maize" 2 45 10 2
      "1021002810" "Beans" .5 45 22 0
      "1021002810" "Maize" 2 45 22 0
      "1021003309" "" . . . .
      "1033000301" "Banana food" 60 742 69 0
      "1033000301" "Maize" 15 12 . 0
      "1033000302" "Beans" 3 35 1 1
      "1033000302" "Cassava" 0 . . .
      "1033000302" "Banana food" 30 742 69 20
      "1033000302" "Beans" 1 35 1 0
      "1033000302" "Coffee all" 0 . . .
      "1033000302" "Fallow" 99999 . . .
      "1033000303" "Sweet potatoes" 0 . . .
      "1033000303" "Beans" 0 . . .
      "1033000303" "Cassava" 0 . . .
      "1033000303" "Maize" 3 32 14 2
      "1033000303" "Banana food" 50 742 68 15
      "1033000303" "" . . . .
      "1033000303" "Fallow" 99999 . . .
      "1033000304" "Banana food" 40 20 68 0
      "1033000304" "Coffee all" 0 . . .
      "1033000304" "Banana beer" 30 20 68 0
      "1033000304" "Sweet potatoes" 0 . . .
      "1033000304" "Maize" 0 . . .
      "1033000304" "Beans" 1 35 10 0
      "1033000304" "Maize" 20 33 1 0
      "103300030403" "Banana food" 5 20 69 0
      "103300030403" "Coffee all" 2 44 10 2
      "103300030403" "Banana food" 6 20 69 0
      "103300030403" "Beans" 1 45 20 0
      "103300030403" "Cassava" 0 . . .
      "103300030403" "Maize" 7.5 45 10 7
      "1033000305" "" . . . .
      "1033000305" "" . . . .
      "1033000305" "" . . . .
      "1033000305" "" . . . .
      "1033000307" "Banana beer" 30 20 68 0
      "1033000307" "Banana beer" 10 20 69 0
      end
      [/CODE]
      ---------

      Comment


      • #4
        These data sets are not mergeable. There are no variables common to both data sets that uniquely identify observations in either one. So there is no way to decide which observation in data set 2 should be paired with which item in data set 1. For example, in data set 1, the 6th observation is about HHID 1021000113 and a4aq5 is Dodo. Now look at observations 5 and 6 in data set 2: both of them also refer to this same HHID and a5aq4 (which appears to be the same thing as a4aq5 in data set 1) is also Dodo. So should observation 6 from data set 1 be paired with observation 5 or observation 6 from data set 2? There is no way to decide from anything in those data sets. There are many such situations in this pair of data sets, often involving a large number of possible matches between observations with the same HHID and a4aq5/a5aq4.

        Possible resolutions of the problem:

        1. Perhaps you want to match up all possible pairs of observations having the same HHID and a4aq5/a5aq4. In that case, rename the variables a4aq5 and a5aq4 in each data set to some common name and then use -joinby- instead of -merge-. (Actually, you need to do the -rename-ing in any case: there is no hope of combining these data sets until you do that much.)

        2. Perhaps if other variables were included in these data sets there would be some combination of variables that would serve to uniquely identify observations in one of them. So perhaps you need to -merge- one or both of these data sets with something else first in order to include that variable that, in combination with HHID and a4aq5/a5aq4, will uniquely identify observations in one or both data sets.

        As an aside, I strongly recommend that you rename all of these a#a# variables to mnemonic names. Even if you can, at the moment, keep straight in our head what each of them refers to, you probably won't be able to within a few days of finishing work on it, and almost certainly won't be able to several months from now. If you have to go back to this work later on when somebody has questions about what you did, you will save yourself a great deal of time if your variable names remind you f what they contain.

        Added: It dawns on me that perhaps you want to pair up every observation for a given HHID in the first data set with every observation on the same HHID in the second data set even if they don't agree on anything else. In that case
        Code:
        use dataset1
        joinby HHID using dataset2
        (Do read -help joinby-, with particular attention to the -unmatched()- option to specify what you want to do with any observations that have no matches in the other data set.)
        Last edited by Clyde Schechter; 03 Jul 2018, 17:46.

        Comment


        • #5
          Hi Clyde,

          Thank you for the feedback! I tried joining another two sets of datasets by renaming and it worked! What about this dataset, they have the same information but I tried renaming the key identifier variables and merging but it did not work.

          Example generated by -dataex-. To install: ssc install dataex
          clear
          input str14 HHID float Parcel_ID int Plot_ID str20 Crop_name int Crop_code double a5aq6a int(a5aq6b a5aq6c) double(a5aq6d a5aq7a) byte a5aq7b int a5aq7c double a5aq8 byte a5aq9 long a5aq10
          "1013000204" 21 . "" . . . . . . . . . . .
          "1021000108" 1 1 "Banana food" 741 48 20 68 12 0 . . . . .
          "1021000108" 1 1 "Cassava" 630 3 20 13 80 0 . . . . .
          "1021000113" 1 1 "Beans" 210 3 24 22 10 0 . . . . .
          "1021000113" 1 1 "Dodo" 460 2 20 22 30 0 . . . . .
          "1021000113" 1 1 "Dodo" 460 10 20 13 35 0 . . . . .
          "1021000113" 1 1 "Maize" 130 1 22 22 30 0 . . . . .
          "1021000113" 1 1 "Yam" 640 99999 . . . . . . . . .
          "1021000408" 1 1 "Banana food" 741 99999 . . . . . . . . .
          "1021000408" 1 2 "Beans" 210 2 35 22 14 0 . . . . .
          "1021000408" 1 2 "Maize" 130 1 12 13 80 0 . . . . .
          "1021000710" 1 1 "Banana food" 741 99999 . . . . . . . . .
          "1021000710" 1 1 "Beans" 210 3.5 24 1 1 0 . . . . .
          "1021000710" 1 1 "Maize" 130 60 13 1 1 0 . . . . .
          "1021000807" 1 1 "Coffee all" 810 7 45 13 70 7 45 13 637000 6 0
          "1021000807" 1 2 "" 910 . . . . . . . . . .
          "1021000807" 2 1 "Banana food" 741 180 20 68 16 80 20 68 200000 6 0
          "1021000807" 2 1 "Banana food" 741 30 20 69 3 0 . . . . .
          "1021000807" 2 1 "Banana food" 741 90 20 67 25 90 20 67 360000 6 0
          "1021000807" 2 2 "Cassava" 630 17 20 13 120 17 20 13 1900 6 0
          "102100080803" 1 1 "Banana food" 741 20 20 68 15 0 . . . . .
          "102100080803" 1 1 "Coffee all" 810 0 . . . . . . . . .
          "102100080803" 2 1 "Beans" 210 1 45 20 20 0 . . . . .
          "102100080803" 2 1 "Maize" 130 1 45 10 100 1 45 10 50000 2 0
          "102100080803" 2 2 "Groundnuts" 310 2 42 13 30 0 . . . . .
          "102100080803" 2 2 "Maize" 130 1 45 12 50 1 45 12 25000 2 0
          "102100080803" 21 1 "Cassava" 630 2 20 9 120 0 . . . . .
          "102100080803" 21 1 "Maize" 130 2.5 45 10 100 1 45 10 50000 2 0
          "102100080803" 21 2 "Sweet potatoes" 620 1 20 9 120 0 . . . . .
          "102100080803" 21 3 "" . . . . . . . . . . .
          "102100110201" 1 2 "Beans" 210 1 35 10 100 0 . . . . .
          "102100110201" 1 . "" . . . . . . . . . . .
          "102100110201" 2 1 "Cocoa" 820 4 24 1 1 4 45 1 12000 2 0
          "102100110201" 2 1 "Coffee all" 810 1 24 21 80 1 44 11 60000 2 0
          "102100110201" 2 1 "Groundnuts" 310 3 22 12 35 0 . . . . .
          "102100110201" 3 1 "Groundnuts" 310 4 22 13 35 2 42 11 120000 2 0
          "102100110201" 3 2 "Beans" 210 1 24 37 20 0 . . . . .
          "102100110201" 3 2 "Cassava" 630 99999 . . . . . . . . .
          "102100110201" 4 . "" . . . . . . . . . . .
          "1021001109" 1 1 "Sweet potatoes" 620 32 20 22 15 0 . . . . .
          "1021001109" 1 2 "Banana food" 741 12 20 67 20 0 . . . . .
          "1021001109" 1 2 "Cassava" 630 2 . 10 100 0 . . . . .
          "1021001109" 1 3 "Maize" 130 .5 21 9 100 0 . . . . .
          "1021001109" 2 1 "Yam" 640 10 20 9 . 6 20 9 300000 2 .
          "1021001109" 2 2 "Sugarcane" 510 120 . 85 . 0 . . . . .
          "1021001109" 3 . "" . . . . . . . . . . .
          "1021001304" 1 1 "Banana food" 741 4 20 68 15 0 . . . . .
          "1021001304" 1 1 "Beans" 210 4 45 32 .5 0 . . . . .
          "1021001304" 1 1 "Beans" 210 10 24 66 3 0 . . . . .
          "1021001304" 1 1 "Cassava" 630 5 20 22 15 0 . . . . .
          "1021001304" 1 1 "Maize" 130 1 24 22 15 0 . . . . .
          "1021001910" 1 1 "Banana food" 741 99999 . . . . . . . . .
          "1021002501" 1 1 "" . . . . . . . . . . .
          "1021002610" 1 1 "Banana food" 741 10 20 68 15 0 . . . . .
          "1021002610" 1 1 "Coffee all" 810 99999 . . . . . . . . .
          "1021002610" 1 1 "Sweet potatoes" 620 8 20 20 25 3 20 20 21000 2 0
          "1021002610" 1 2 "Cassava" 630 99999 . . . . . . . . .
          "1021002610" 1 2 "Yam" 640 5 20 20 25 0 . . . . .
          "1021002610" 1 3 "Groundnuts" 310 1 45 20 16 0 . . . . .
          "1021002610" 1 3 "Maize" 130 1 45 20 16 0 . . . . .
          "1021002610" 21 1 "Sweet potatoes" 620 2 20 20 25 0 . . . . .
          "1021002611" 1 1 "Maize" 130 1 40 10 130 0 . . . . .
          "1021002611" 1 1 "Maize" 130 2 45 10 100 2 45 10 150000 2 0
          "1021002810" 1 1 "Beans" 210 .5 45 22 15 0 . . . . .
          "1021002810" 1 1 "Maize" 130 2 45 22 15 0 . . . . .
          "1021003309" 21 . "" . . . . . . . . . . .
          "1033000301" 1 1 "Banana food" 741 60 742 69 6 0 . . . . .
          "1033000301" 1 1 "Maize" 130 15 12 . 225 0 . . . . .
          "1033000302" 1 1 "Beans" 210 3 35 1 20 1 45 1 20000 3 0
          "1033000302" 1 1 "Cassava" 630 0 . . . . . . . . .
          "1033000302" 1 2 "Banana food" 741 30 742 69 7 20 20 69 60000 3 0
          "1033000302" 1 2 "Beans" 210 1 35 1 20 0 . . . . .
          "1033000302" 1 2 "Coffee all" 810 0 . . . . . . . . .
          "1033000302" 1 3 "Fallow" 930 99999 . . . . . . . . .
          "1033000303" 1 1 "Sweet potatoes" 620 0 . . . . . . . . .
          "1033000303" 1 2 "Beans" 210 0 . . . . . . . . .
          "1033000303" 1 2 "Cassava" 630 0 . . . . . . . . .
          "1033000303" 1 3 "Maize" 130 3 32 14 20 2 32 14 12000 3 0
          "1033000303" 1 4 "Banana food" 741 50 742 68 7 15 . 68 15500 5 0
          "1033000303" 2 . "" . . . . . . . . . . .
          "1033000303" . 5 "Fallow" 930 99999 . . . . . . . . .
          "1033000304" 1 1 "Banana food" 741 40 20 68 200 0 . . . . .
          "1033000304" 1 1 "Coffee all" 810 0 . . . . . . . . .
          "1033000304" 1 2 "Banana beer" 742 30 20 68 120 0 . . . . .
          "1033000304" 21 1 "Sweet potatoes" 620 0 . . . . . . . . .
          "1033000304" 22 1 "Maize" 130 0 . . . . . . . . .
          "1033000304" 22 2 "Beans" 210 1 35 10 100 0 . . . . .
          "1033000304" 22 2 "Maize" 130 20 33 1 20 0 . . . . .
          "103300030403" 1 1 "Banana food" 741 5 20 69 5 0 . . . . .
          "103300030403" 1 1 "Coffee all" 810 2 44 10 100 2 44 10 70000 2 0
          "103300030403" 2 1 "Banana food" 741 6 20 69 5 0 . . . . .
          "103300030403" 21 1 "Beans" 210 1 45 20 20 0 . . . . .
          "103300030403" 21 1 "Cassava" 630 0 . . . . . . . . .
          "103300030403" 21 1 "Maize" 130 7.5 45 10 100 7 45 10 540000 2 0
          "1033000305" 1 . "" . . . . . . . . . . .
          "1033000305" 2 . "" . . . . . . . . . . .
          "1033000305" 3 . "" . . . . . . . . . . .
          "1033000305" 21 . "" . . . . . . . . . . .
          "1033000307" 1 1 "Banana beer" 742 30 20 68 5 0 . . . . .
          "1033000307" 2 1 "Banana beer" 742 10 20 69 5 0 . . . . .
          end
          label values a5aq9 A5AQ9
          label def A5AQ9 2 "Private trader in Local village/market", modify
          label def A5AQ9 3 "Private trader in district market", modify
          label def A5AQ9 5 "Neighbor/ Relative", modify
          label def A5AQ9 6 "Other (specify)", modify
          [/CODE]


          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str14 HHID float Parcel_ID long Plot_ID str20 Crop_name long Crop_code double a5bq6a int a5bq6b long a5bq6c double(a5bq6d a5bq7a) int a5bq7b float a5bq7c long(a5bq8 a5bq9) double a5bq10
          "1013000204"   21 1 "BANANA"         741     0  .  .   .   .  .  .      . .    .
          "1013000204"   21 1 "Cassava"        630    .5 10 11  80  .5 10 11  20000 5    0
          "1013000204"   21 1 "MAIZE"          130     1 20 39   5   0  .  .      . .    .
          "1013000204"   21 1 "cassava"        630     1 10 11  80  .5 10 11  20000 5    0
          "1021000108"    1 1 "Banana food"    741    36 20 68  15   0  .  .      . .    .
          "1021000108"    1 2 "Beans"          210    24 45 13  45   0  .  .      . .    .
          "1021000108"    1 2 "Maize"          130     1 20 13  70   0  .  .      . .    .
          "1021000108"    1 3 ""                 .     .  .  .   .   .  .  .      . .    .
          "1021000108"    1 3 "Fallow"         930     .  .  .   .   .  .  .      . .    .
          "1021000113"    1 1 "Cabbage"        410     6 20 22  20   3 20 22   7000 6 3000
          "1021000113"    1 2 "Other"          890     8 20 22   6   5 20 22  15000 6 2000
          "1021000113"    1 3 "Dodo"           460    12 20 22   6   5 20 22  10000 6 2000
          "1021000113"    1 4 "Dodo"           460    12 20 22   6   4 20 22   8000 6 2000
          "1021000113"    1 5 "Yam"            640     3 20 22  18   0  .  .      . .    .
          "1021000408"    1 . ""                 .     .  .  .   .   .  .  .      . .    .
          "1021000710"    1 1 "Banana food"    741    40 20 68  23   0  .  .      . .    .
          "1021000807"    1 1 "Coffee all"     810     4 45 13  70   4 45 13 364000 6    0
          "1021000807"    1 2 ""               910     .  .  .   .   .  .  .      . .    .
          "1021000807"    2 1 "Banana food"    741    65 20 68  16   0  .  .      . .    .
          "1021000807"    2 1 "Banana food"    741    30 20 67  25  30 20 67 135000 6    0
          "1021000807"    2 2 ""               630     .  .  .   .   .  .  .      . .    .
          "1021000807"    . 2 ""                25     .  .  .   .   .  .  .      . .    .
          "102100080803"  1 1 "Banana food"    741    30 20 68  17   0  .  .      . .    .
          "102100080803"  1 1 "Coffee all"     810     2 45 13  75   2 45 13 120000 2    0
          "102100080803"  1 2 ""                 .     .  .  .   .   .  .  .      . .    .
          "102100080803"  2 1 "Beans"          210    15 45  1   1   0  .  .      . .    .
          "102100080803"  2 1 "Maize"          130     1 45 12  50   0  .  .      . .    .
          "102100080803"  2 2 "Groundnuts"     310     2 42 13  35   0  .  .      . .    .
          "102100080803"  2 2 "Maize"          130     1 45 13  75  25 45  1   6250 2    0
          "102100080803" 21 1 ""                 .     .  .  .   .   .  .  .      . .    .
          "102100080803" 21 1 "Maize"          130     1 45 10 100  .5 45  1  12500 2    0
          "102100110201"  1 . ""                 .     .  .  .   .   .  .  .      . .    .
          "102100110201"  2 1 ""               820     .  .  .   .   .  .  .      . .    .
          "102100110201"  2 1 "Coffee all"     810     2 24 11  80   2 44 11 120000 2    0
          "102100110201"  2 2 "Cocoa"          820 99999  .  .   .   .  .  .      . .    .
          "102100110201"  3 1 "Cassava"        630     6 20 12  50   6 20 12  60000 2    0
          "102100110201"  3 1 "Cassava"        630     2 20 10 100   .  .  .      . .    0
          "102100110201"  4 . ""                 .     .  .  .   .   .  .  .      . .    .
          "102100110201"  . 2 ""                38     .  .  .   .   .  .  .      . .    .
          "1021001109"    1 . ""                 .     .  .  .   .   .  .  .      . .    .
          "1021001109"    2 1 "Yam"            640   2.5 20 10 100   0  .  .      . .    .
          "1021001109"    3 1 ""                 .     .  .  .   .   .  .  .      . .    .
          "1021001109"    3 1 "Banana food"    741    20 20 67  15   0  .  .      . .    .
          "1021001109"    3 2 ""                 .     .  .  .   .   .  .  .      . .    .
          "1021001109"    3 2 "Cassava"        630     2 20 10 120   0  .  .      . .    .
          "1021001109"    3 3 ""                 .     .  .  .   .   .  .  .      . .    .
          "1021001109"    3 3 "Sweet potatoes" 620     2 20 10 100   0  .  .      . .    .
          "1021001109"    3 4 ""                 .     .  .  .   .   .  .  .      . .    .
          "1021001109"    3 4 "Maize"          130     1 21 10 100   0  .  .      . .    .
          "1021001304"    1 1 "Banana food"    741     7 20 68  20   0  .  .      . .    .
          "1021001304"    1 1 "Beans"          210     1 45 20  25   0  .  .      . .    .
          "1021001304"    1 1 "Beans"          210    20 24 32  .5   0  .  .      . .    .
          "1021001304"    1 1 "Cassava"        630     3 20 22  17   0  .  .      . .    .
          "1021001304"    1 2 "Sweet potatoes" 620    .5 20 10  17   0  .  .      . .    .
          "1021001910"    1 . ""                 .     .  .  .   .   .  .  .      . .    .
          "1021002501"    1 . ""                 .     .  .  .   .   .  .  .      . .    .
          "1021002610"    1 1 "Banana food"    741     5 20 68  15   0  .  .      . .    .
          "1021002610"    1 1 "Coffee all"     810     3 44 20  20   0  .  .      . .    .
          "1021002610"    1 1 "Sweet potatoes" 620 99999  .  .   .   .  .  .      . .    .
          "1021002610"    1 2 "Cassava"        630     1 20 20  25   0  .  .      . .    .
          "1021002610"    1 2 "Yam"            640     3 20 20  25   2 20 20  20000 4    0
          "1021002610"    1 3 "Cassava"        630 99999  .  .   .   .  .  .      . .    .
          "1021002610"   21 1 "Yam"            640 99999  .  .   .   .  .  .      . .    .
          "1021002611"    1 1 "Beans"          210   1.5 23 22  20   0  .  .      . .    .
          "1021002611"    1 1 "Cassava"        630 99999  .  .   .   .  .  .      . .    .
          "1021002611"    1 1 "Sweet potatoes" 620 99999  .  .   .   .  .  .      . .    .
          "1021002810"    1 1 "Sweet potatoes" 620     3 40 22  15   0  .  .      . .    .
          "1021002810"    1 2 "Banana food"    741 99999  .  .   .   .  .  .      . .    .
          "1021003309"   21 . ""                 .     .  .  .   .   .  .  .      . .    .
          "1033000301"    1 1 "Banana food"    741    15 20 69   5   0  .  .      . .    .
          "1033000301"    1 1 "Maize"          130     2 45  9 120   0  .  .      . .    .
          "1033000302"    1 1 "Beans"          210     3 35 20  20   0  .  .      . .    .
          "1033000302"    1 1 "Maize"          130     7 33 10 100 1.5 45 10  27000 2    0
          "1033000302"    1 1 "maize"          130     7 45 10 100 1.5 45 10  27000 2    0
          "1033000302"    1 2 "Coffee all"     810     2 24 10 100   2 24 10  70000 2    0
          "1033000302"    1 2 "banana"         741    24 20 68  12   2 20 68   8000 2    0
          "1033000302"    1 3 "Cassava"        630     2 20 10 100   1 40 10  35000 2    0
          "1033000303"    1 1 "Sweet potatoes" 620     0  .  .   .   .  .  .      . .    .
          "1033000303"    1 2 "Cassava"        630     1 20 10 100   0  .  .      . .    .
          "1033000303"    1 2 "beans"          210     0  .  .   .   .  .  .      . .    .
          "1033000303"    1 3 "Banana food"    741    72 10 68  12   0  .  .      . .    .
          "1033000303"    1 4 "Beans"          210     2 35 20  20   0  .  .      . .    .
          "1033000303"    1 4 "Maize"          130  14.5 33 10 100   3 45 10  90000 2    0
          "1033000303"    1 4 "maize"          130   5.5 33 10 100   3 45 10  90000 2    0
          "1033000303"    2 1 ""                 .     .  .  .   .   .  .  .      . .    .
          "1033000303"    2 2 ""                 .     .  .  .   .   .  .  .      . .    .
          "1033000303"    2 3 ""                 .     .  .  .   .   .  .  .      . .    .
          "1033000303"    2 4 ""                 .     .  .  .   .   .  .  .      . .    .
          "1033000304"    1 1 "Banana food"    741   168 69  5   0   0  .  .      . .    .
          "1033000304"    1 1 "Coffee all"     810    50 20 20  20  50 20 20 200000 2    0
          "1033000304"    1 2 "Banana beer"    742    60 20 69   5   0  .  .      . .    .
          "1033000304"   21 1 "Sweet potatoes" 620     3 20 20  20   0  .  .      . .    .
          "1033000304"   22 . ""                 .     .  .  .   .   .  .  .      . .    .
          "103300030403"  1 1 "Banana food"    741     5 20 69   5   0  .  .      . .    .
          "103300030403"  1 1 "Coffee all"     810     3 20 10 100   3 20 10  90000 .    .
          "103300030403"  2 1 "Banana food"    741     6 20 69   5   0  .  .      . .    .
          "103300030403" 21 1 "Beans"          210     1 20 20  20   0  .  .      . .    .
          "103300030403" 21 1 "Cassava"        630     0  .  .   .   .  .  .      . .    .
          "103300030403" 21 1 "Maize"          130    12 45 10 100   0  .  .      . .    .
          "1033000305"    1 1 ""                 .     .  .  .   .   .  .  .      . .    .
          end
          label values a5bq9 A5BQ9
          label def A5BQ9 2 "Private trader in Local village/market", modify
          label def A5BQ9 4 "Consumer at market", modify
          label def A5BQ9 5 "Neighbor/ Relative", modify
          label def A5BQ9 6 "Other (specify)", modify

          Comment


          • #6
            In what sense did it "not work?" When I save these as dataset1 and dataset2 and then run the code in #4, they join together properly. Please show the actual code you used and the results you obtained, and, if it isn't obvious, explain why it isn't what you wanted.

            Comment


            • #7
              Hi Clyde,

              Sorry, what I meant was I want to merge the two datasets not join them together. Maybe the two datasets cannot be merged?

              Here is the code I used
              use "dataset 1"
              merge m:1 HHID Parcel_ID using " dataset 2"
              (note: variable Plot_ID was int, now long to accommodate using data's values)
              (note: variable Crop_code was int, now long to accommodate using data's values)
              variables HHID Parcel_ID do not uniquely identify observations in the using data
              r(459);

              Comment


              • #8
                No, these data sets cannot be -merge-d, because, as Stata explains, the variables HHID and Parcel_ID do not uniquely identify observations in dataset 2. (Actually, they don't in dataset 1 either.) You can still join them with -joinby HHID Parcel_ID- if you want to pair up every observation of a given HHID Parcel_ID combination in dataset 1 with every observation from dataset 2 that has the same HHID and Parcel_ID. Looking a bit more closely at the data, but not knowing exactly what you are looking to do here, I will guess that you might want to do something with a bit closer pairing such as:

                Code:
                joinby HHID Parcel_ID Plot_ID Crop_code using "dataset 2"
                That will generate a data set of more closely matched observations. But even with these four variables you do not have unique identification of observations in either data set. Are you sure the data sets are correct? Perhaps they are incomplete: is there an additional variable (perhaps a date) that might provide unique identification along with HHID Parcel_ID Plot_ID and Crop_code.

                Comment

                Working...
                X