Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Determine which observations are changed with -merge-, update replace

    Hi,

    I am updating and replace non-missing values in a master data set with non-missing values in the using data set. As an example, let's say I have 10 variables var1-9 with 20 observations and one of the variables, let's call it ID, is unique and non-repeating. I updated and replaced var1, var2, var3 for 3 observations, i.e. 3 of the ID variables. I got the _merge == 5 result I expected and I double-checked the new data set only had these 3 variables replaced for these 3 observations. However, I had to check in Excel. Is there a command to return only the observations that were changed?

    Code:
    list if _merge == 5
    Will return all 10 variable names (and values) for the 3 observations changed. I am just trying to get the output of var1, var2, var3 that were changed with -update, replace-. Could not find anything by searching old threads.

    Thank you!!

  • #2
    Try this:
    Code:
    use master_data_set, clear
    merge 1:1 ID using update_data_set, update replace
    cf _all using master_data_set, verbose

    Comment


    • #3
      Hmm, thanks for the quick response Clyde. Unfortunately, it just returns the number of observations that were changed.

      Code:
      master has 20 obs, using 3
      I am trying confirm which observations were updated and replaced.

      Comment


      • #4
        Unfortunately, it just returns the number of observations that were changed.
        Not true. Did you forget the -verbose- option?
        Code:
        . sysuse auto, clear
        (1978 automobile data)
        
        . replace mpg = 18 in 3
        (1 real change made)
        
        . replace headroom = 3.0 in 42
        (1 real change made)
        
        . save mycf, replace
        file mycf.dta saved
        
        .
        . sysuse auto, clear
        (1978 automobile data)
        
        . cf _all using mycf, verbose
                     mpg: 1 mismatch
                          obs  3. 22 in master; 18 in using
                headroom: 1 mismatch
                          obs 42. 2 in master; 3 in using

        Comment


        • #5
          Hmm odd. Now I get
          Code:
          variable master_data_set not found
          Yes, I used -verbose-

          Comment


          • #6
            Ignore post #5, I forgot to use the term -using- in the code line.

            Regardless, I still get the
            Code:
             
             master has 20 obs, using 3
            with -verbose- option.

            Comment


            • #7
              Maybe you have an older version of Stata? Even in version 17, though, this works on my computer. I can't vouch for versions 16 or earlier as I no longer have those.

              If you are using version 18 (on Statalist we always assume you are using the current version unless you tell us otherwise), make sure your installation is fully updated.

              Comment


              • #8
                I am using Stata 18 that is fully updated. I figured out why I was getting the generic response without the details of -verbose- option. The using and master file must have the same variables. I had tried to simplify my merge by only having the variables in the using file that I wanted to update and replace in the master. Once I tried the cf command with a using file that had all the same variables ad the master, it worked as expected.

                Thank you again for your assistance Clyde!

                Comment

                Working...
                X