  • merge issues with assert() and keep() options

    Hi Statalist -

    I'm having a pretty frustrating experience with merge. The keep() and assert() options are behaving in a way I really can't get my head around, and clearly not in the way the documentation for merge describes. There seems to be some interaction between these two options that throws things off. It would be really helpful if someone could explain what's going on.

    Here is an example you can test out to see the problem.

    * Begin setup code
    clear
    set obs 2
    gen id = 1 if _n == 1
    replace id = 3 if _n == 2
    tempfile master
    save `master'

    clear
    set obs 2
    gen id = 2 if _n == 1
    replace id = 3 if _n == 2
    tempfile using
    save `using'
    * End setup code

    You can see I've created a master dataset with ID values 1 and 3, and a using dataset with ID values 2 and 3. Starting with the master dataset and merging in the using on id, we should see that id==1 is master only, id==2 is using only, and id==3 is matched. From my understanding of the assert() option, only assert(master using match) should work.
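
    As a quick check of the expected classification, a plain merge with neither assert() nor keep() can be tabulated. This is only a sketch, and it assumes the setup code above was run in the same session, so that the tempfiles `master' and `using' still exist:

    ```stata
    * Load the master data first; after the setup code, the using data is in memory
    use `master', clear

    * Plain 1:1 merge with no assert() or keep(), so nothing is checked or dropped
    merge 1:1 id using `using'

    * Show which source each id came from
    list id _merge, sepby(_merge)
    tabulate _merge
    ```

    With the data above, _merge should show one master-only observation (id==1), one using-only observation (id==2), and one matched observation (id==3).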

    Here are some cases of what actually happens:
    (1) merge 1:1 id using `using', assert(master match)
    --> This fails, and retains in memory id==1, id==2, id==3. This matches my understanding of assert().
    (2) merge 1:1 id using `using', assert(master match) keep(match)
    --> This works, and retains only the id==3 obs. It seems to me this assert should have failed, unless the keep() option executed before assert(). Things start getting weird in the next one.
    (3) merge 1:1 id using `using', assert(match) keep(match)
    --> This fails, and retains in memory id==1 and id==3. Note that this would have worked if keep() were in fact executing before assert(), so now I'm at a loss as to why (2) works.
    (4) merge 1:1 id using `using', assert(using match) keep(match)
    --> This fails, and retains in memory id==1 and id==3. This also appears very hard to reconcile with the fact that (2) works.

    I guess my question is, why does (2) work, and how is Stata processing the assert() and keep() options in a way that is consistent with these four results?

    Thank you!

  • #2
    I guess my question is, why does (2) work, and how is Stata processing the assert() and keep() options in a way that is consistent with these four results?
    Good question. I can't reproduce your problem. On my setup (Stata 15.1 MP2, Windows 7) (2) breaks with the appropriate error message:

    Code:
    * Begin setup code
    clear
    set obs 2
    gen id = 1 if _n == 1
    replace id = 3 if _n == 2
    tempfile master
    save `master'
    
    clear
    set obs 2
    gen id = 2 if _n == 1
    replace id = 3 if _n == 2
    tempfile using
    save `using'
    * End setup code
    
    use `master', clear
    merge 1:1 id using `using', assert(master match) keep(match)
    gives me:
    Code:
    . merge 1:1 id using `using', assert(master match) keep(match)
    merge:  after merge, not all observations from master or matched
            (merged result left in memory)
    r(9);
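
    For comparison, the same check can be spelled out by hand, so that the order of the two steps is explicit. This is a sketch of the check-then-keep order implied by the error above, not a description of how merge is implemented internally; it assumes the tempfiles from the setup code still exist:

    ```stata
    use `master', clear
    merge 1:1 id using `using'

    * Manual counterpart of assert(master match):
    * fail if any observation is using-only (_merge == 2)
    capture noisily assert inlist(_merge, 1, 3)
    display "assert return code: " _rc

    * Manual counterpart of keep(match), applied only after the check
    keep if _merge == 3
    ```

    Because id==2 is using-only in this example, the assert step should report a failure before anything is dropped. Dropping capture noisily would make the do-file stop at that point, as merge does.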



    • #3
      Very, very strange. I double-checked, and (2) works for me as described above, keeping only id==3 in memory. I even reran the code exactly as you posted it. I'm using Stata SE 15.0 on Windows 10 Pro.



      • #4
        This is a known bug in version 15.0. It was fixed in the release of 15.1. Just update your Stata installation.
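
        A minimal sketch of checking and updating from within Stata, assuming the installation has internet access and permission to write to the Stata directory:

        ```stata
        * Show the installed version, flavor, and revision date
        about
        display c(stata_version)

        * Compare the installation against the latest available update
        update query

        * Download and install all available updates
        update all
        ```

        After the update, rerunning about should confirm that the installation reports 15.1.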



        • #5
          Resolved! I updated to 15.1 and that seems to have done the trick. Thank you!
