
  • Problems with merge command: no observations matched

    Hi! I have a cross-sectional time-series dataset on conflicts and would like to add some country-specific information from another dataset.

    My master dataset has 2,713 observations and my using dataset has 13,054 observations.

    The variables I intend to use for merging are country number (Number) and year (YEAR). The master dataset has duplicates on these variables (multiple conflicts per country and year), whereas the using dataset has unique observations, which makes me believe that merge m:1 is the correct command.


    However, when I do that, no observations are matched:

    Code:
    . merge m:1 Number YEAR using "\\hume\student-u46\birthee\pc\Downloads\WDI CLEAN SORTED.dta"
    (variable location was str28, now str52 to accommodate using data's values)
    (variable YEAR was int, now long to accommodate using data's values)
    (label BOTH already defined)
    
        Result                      Number of obs
        -----------------------------------------
        Not matched                        15,767
            from master                     2,713  (_merge==1)
            from using                     13,054  (_merge==2)
    
        Matched                                 0  (_merge==3)
        -----------------------------------------
    This is despite the fact that a quick manual check of the datasets reveals that there are matching observations on these two variables.

    I am pretty new to merging datasets, so any suggestions are appreciated!

    Birte
    Last edited by Birte Olsen; 04 Feb 2022, 10:46.

  • #2
    Birte:
    welcome to this forum.
    Are you sure that you should not use -append- instead?
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      If indeed merge is the appropriate command, are you certain that the YEAR variable is coded identically in the two datasets? In your master dataset YEAR is an int, which is often used for storing 4-digit years. In your using dataset, YEAR is a long, which can hold values of up to nine digits, far more than a year needs.

      I'm concerned that in your using dataset, YEAR is actually something like a Stata Internal Format daily date for, say, January 1 2021 or December 31 2021. We see that a lot in financial data presented on Statalist. And perhaps it has been assigned a Stata datetime format like %tdCCYY so that only the year is displayed, hiding from you what has been done.

      Consider
      Code:
      describe YEAR using "\\hume\student-u46\birthee\pc\Downloads\WDI CLEAN SORTED.dta"
      and see whether something odd like this might be happening. The giveaway is a datetime format assigned to the YEAR variable: that is unnecessary and, while possible, rarely seen, because the yearly date value 2021 is represented in Stata Internal Format as 2021, so no datetime format is needed.
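
      A minimal sketch of that check and one possible repair, assuming the using dataset's YEAR really does hold Stata daily dates (unconfirmed here). The file path and variable names are those shown in post #1; the new variable name and the output filename are hypothetical.

      ```stata
      * Sketch only: assumes YEAR in the using file may be a daily date.
      use "\\hume\student-u46\birthee\pc\Downloads\WDI CLEAN SORTED.dta", clear

      * Storage type and display format; a %td... format would be the giveaway.
      describe YEAR

      * summarize ignores the display format by default, so this shows raw values.
      * Plain years look like 1990 or 2021; daily dates for recent years are
      * five-digit numbers (1 Jan 2021 is stored as 22281).
      summarize YEAR

      * Run the lines below ONLY if the raw values turn out to be daily dates.
      generate int year_fixed = year(YEAR)   // extract the calendar year
      drop YEAR
      rename year_fixed YEAR

      * Hypothetical filename: save a corrected copy rather than overwriting.
      save "WDI_year_fixed.dta", replace
      ```

      If the raw values already are plain 4-digit years, the mismatch lies elsewhere (for example in Number), and the conversion lines should be skipped.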



      • #4
        Dear William,

        You were indeed correct! It was the storage type that was mismatched; I changed it and the merge worked perfectly!

        Thank you!



        • #5
          Dear Birte, I am facing the same problem. Could you kindly tell me how to change the storage type? Thanks.



          • #6
            Even after changing the storage type, I am unable to merge the two datasets.
            The result I get is: Matched 0 (_merge==3).



            • #7
              Descriptions of data are well-meant but insufficient to help those who want to help you. Even the best descriptions of data are no substitute for an actual example of the data. There are many ways your data might be organized that are consistent with your description, and each would require a somewhat different approach. In order to get a helpful response, you need to show some example data.

              Be sure to use the dataex command to do this. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

              When asking for help with code, always show example data. When showing example data, always use the dataex command.

              In the current case, please read the output of
              Code:
              help dataex
              to learn how to use the command, and then run
              Code:
              dataex Number YEAR, count(25)
              in each of your two datasets and post the results here.
