Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • The update and replace option for merge

    Dear Statalist,

    Often times I have to perform the merge between a 10% subsample of my master file and the master file. We tend to verify 10% of our master data as a way to check whether our data entry were done accurately. As a result, I often have to merge the 10% sample back to the master file if anything were changed/updated in the 10% sample file. I used the update and replace option for the merge command to do this conveniently. Although Stata will produce a nice output of how many case have missing or non-missing values updated (see example below), I was wondering if there is any way that Stata can tell me which case (this can be done by listing the studyid for those who had _merge==5 or _merge==4, but if there are two changes made to a single case then stata will only tell you that changes made to that case but not how many changes were made to the case) and what value were updated (this is the part I needed the most help). What I did previously was creating two datasets (one with updated data and the other was the master file) and added an extension to all variable names in the subsample dataset and merge them all together. By comparing the same set of variables using the foreach command and list with "if" option, I was able to tell which case has what values updated however this is somehow time consuming. I am wondering if there is any other convenient way of doing this in Stata. Any advice/feedback on this would be greatly appreciated.

    Thanks

    Sam


    Result # of obs.
    -----------------------------------------
    not matched 294
    from master 294 (_merge==1)
    from using 0 (_merge==2)
    matched 33
    not updated 26 (_merge==3)
    missing updated 0 (_merge==4)
    nonmissing conflict 7 (_merge==5)
    -----------------------------------------



  • #2
    Sam,

    Check out vmerge (short for verbose merge), available at SSC, and see if that meets your needs. If not, let me know and I may be able to make some improvements.

    Regards,
    Joe

    Comment


    • #3
      Hi Joe,

      This is exactly what I wanted, however, i wasn't sure if I use the command correctly. Maybe you could help me examine my codes below:

      Code:
       vmerge 1:1 StudyID_T3 using "V:\El Valor Data\Data Verification\T3 Customer Data Verification\EV_Cust_Interw_ALL_10pcnt_sample_12-10-15_GW.dta", replace update
      After running the code, I receive an error message like this:

      invalid 'Valor'
      r(198);

      It seems like it didn't like my file path, but I'm pretty sure that my file path is correct. Any help on this is appreciated. Thanks for creating such a convenient wrapper.

      Comment


      • #4
        Sam,

        I think this may be a bug associated with using file paths that have blanks in them. This should work when the path is surrounded by quotes, but apparently doesn't in this case. I need to do some investigation into how Stata passes these file names. When I find a solution I will send you an updated vmerge.ado.

        Regards,
        Joe

        Comment


        • #5
          Sam,

          Actually, it was an easy fix: I failed to include the quotes when passing the file path to Stata's merge command. The revised vmerge,ado is attached. You can copy this to your working directory or to your ado-files directory. If you'd rather wait, I will get this version onto SSC within the next few days.

          Good catch!

          Regards,
          Joe
          Attached Files

          Comment


          • #6
            Hello Joe,

            Thanks so much for the updated ado file. I replaced the new one with the old one and it's working well. Thanks so much for getting this fixed so quickly. This saves me tons of work!!!!
            Thanks again for creating such a convenient and useful wrapper.

            Sam

            Comment

            Working...
            X