
  • Joinby / merge question

    Hi All!

    After going through the Stata help files for joinby and merge, I still can't figure out how to do the following:

    I have two datasets that I am trying to combine into one, matching on the variable Date, which appears in both. I also want to keep only the matched observations (i.e. merge the two, but keep only the data where Date in the master dataset matches Date in the using dataset).

    The command I am trying is merge 1:1 Data using XXXX.dta, but from what I can tell it keeps only the master dataset's variables for the matched observations. Is there any way around this?

    For reference I will attach both datasets (I am trying to join HK to China, but keep only the data with a matching Date variable).

    (P.S. I have tried the joinby command as well, but that doesn't seem to do anything at all.)

    Thank you in advance!












  • #2
    When merging, the matching key variable (Date) must have the same name in both datasets, but the other variables should have different names; otherwise merge keeps the master's values for any same-named variable and discards the using dataset's values. I am not certain where you want to go, but with this you lose no information:
    Code:
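    use china.dta, clear
    * rename the China variables so they do not clash with the same-named HK variables
    rename Country Country_C
    rename Adj_Close Adj_Close_C
    * Date is the key, so it keeps the same name in both files
    merge 1:1 Date using HK.dta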
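
    To see why the rename matters, here is a minimal sketch with two made-up toy datasets standing in for the attached files (the dates, prices and the tempfile name hk are placeholders of mine, not the real data). Without the rename, the matched row keeps the China values and the HK values are silently dropped:

    ```stata
    * Toy data only: dates and prices are invented for illustration
    clear
    input str10 Date str5 Country Adj_Close
    "2015-01-02" "HK" 23.5
    "2015-01-05" "HK" 23.1
    end
    tempfile hk
    save `hk'

    clear
    input str10 Date str5 Country Adj_Close
    "2015-01-02" "China" 3.4
    "2015-01-06" "China" 3.6
    end

    * No rename here: Country and Adj_Close exist in both files, so for the
    * matched date merge keeps the master (China) values and ignores HK's
    merge 1:1 Date using `hk'
    list, sepby(_merge)
    ```

    Run it as a do-file so that the tempfile macro stays defined throughout.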



    • #3
      After merge, a variable called _merge is left behind. If you type tab _merge you can see how many observations have variables from only the HK dataset, only the China dataset, and from both datasets. If you then type tab _merge, nolabel you can see which numeric value of _merge corresponds to which scenario. You can then type keep if _merge == 3 (I happen to know by heart that 3 is the value that corresponds to a match in both datasets, but don't take my word for it; check with tab. I don't trust my memory for such crucial steps either, and I check every time I merge).
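
      As a rough sketch of those steps, assuming two invented toy datasets in place of the real HK and China files (the dates, prices and the tempfile name hk are placeholders):

      ```stata
      * Toy data only: dates and prices are invented for illustration
      clear
      input str10 Date Adj_Close
      "2015-01-02" 23.5
      "2015-01-05" 23.1
      end
      tempfile hk
      save `hk'

      clear
      input str10 Date Adj_Close_C
      "2015-01-02" 3.4
      "2015-01-06" 3.6
      end

      merge 1:1 Date using `hk'
      * counts shown with the value labels
      tab _merge
      * the same counts shown with the underlying numeric codes
      tab _merge, nolabel
      * keep only the dates found in both files, once the tables confirm the code
      keep if _merge == 3
      drop _merge
      ```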

      Instead of typing tab _merge and tab _merge, nolabel you can also use fre _merge, which gives you the counts, value labels and actual values in one command. You can get the user-written fre command by typing ssc install fre.

      Alternatively, you can add the keep(match) option to the merge command. I recommend against that. Merging datasets and deleting observations is a crucial step in data preparation that needs to be done very carefully. You need to check and look at the observations that weren't matched to see if something went wrong, and you cannot do that with that option.
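
      One way to do that check, sketched on invented toy data rather than the real files (the dates, prices and the tempfile name hk are placeholders, and whether an unmatched date is harmless is something only you can judge for your data):

      ```stata
      * Toy data only: dates and prices are invented for illustration
      clear
      input str10 Date Adj_Close
      "2015-01-02" 23.5
      "2015-01-05" 23.1
      end
      tempfile hk
      save `hk'

      clear
      input str10 Date Adj_Close_C
      "2015-01-02" 3.4
      "2015-01-06" 3.6
      end

      merge 1:1 Date using `hk'
      * look at the unmatched observations before deleting anything
      list Date _merge if _merge != 3
      count if _merge != 3
      * drop them only after deciding that these gaps are expected
      keep if _merge == 3
      drop _merge
      ```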

      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------



      • #4
        Thank you for the replies! Svend was right to point out that the variables should all have different names for the merge to be a success.


