Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging many to one

    Hello there I'm using merge - the 1:m command. Procedureno is the unique number, BUT there are duplicates in the Hospital dataset.
    So I would like to merge to the proceduredata set and compare the dates. Bascially keep the procedureidno observations with the closest date to the opdate.

    Q1, As you can see both datasets all have unique variables except: gender and procedureidno - so why does it says Matched(3) - should it not say Matched(2) - as these are the only common variables.

    Q2. I also have unique string variables in the hospitaldataset which are not shown here labelled 'implantdetail'. The 'implantdetail' is not present in the proceduredataset. I assume this won't create a problem whilst merging but the dataset will just create a column with the string variable?



    Here are the datasets

    Hospital Dataset

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double procedureno long gender2 float opdate
    220 1 22271
    220 1 22273
    221 2 22281
    221 2 22282
    221 2 22282
    223 1 22312
    end
    format %td opdate
    label values gender2 gender2
    label def gender2 1 "F", modify
    label def gender2 2 "M", modify


    Procedure dataset - this is the master dataset

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double procedureno long gender2 float date
    220 1 22271
    221 2 22281
    223 1 22312
    end
    format %td date
    label values gender2 gender2
    label def gender2 1 "F", modify
    label def gender2 2 "M", modify

    Code used:
    merge 1:m procedureno using "sourceoffile.dta"

    Merged dataset
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double procedureno long gender2 float(date opdate) byte _merge
    220 1 22271 22271 3
    221 2 22281 22281 3
    223 1 22312 22312 3
    220 1 22271 22273 3
    221 2 22281 22282 3
    221 2 22281 22282 3
    end
    format %td date
    format %td opdate
    label values gender2 gender2
    label def gender2 1 "F", modify
    label def gender2 2 "M", modify
    label values _merge _merge
    label def _merge 3 "Matched (3)", modify

  • #2
    The _merge variable is categorical and the labels just define the categories. A value of 3 represents matched observations.

    Numeric Equivalent
    code word (results) Description
    -------------------------------------------------------------------
    1 master observation appeared in master only
    2 using observation appeared in using only
    3 match observation appeared in both

    4 match_update observation appeared in both,
    missing values updated
    5 match_conflict observation appeared in both,
    conflicting nonmissing values
    -------------------------------------------------------------------

    See

    Code:
    help merge

    Comment

    Working...
    X