Hello all, I have a simple question about merging data sets. I have two data sets: the master has 625 obs and the using has 705. Each data set contains repeating cities, with different street names for each city. The master contains info from 2022 and the using contains info from 2021. I want to merge the two in order to add some info to the master that only appears in the using. (I used merge m:m, with the city and street names as the key variables.)

My question is: why does the number of unmatched observations from master (_merge==1) plus the number of matched observations (_merge==3) not equal the number of obs in the master data set? Ignoring the unmatched obs from using (_merge==2, meaning info that only the 2021 data set contains and that is of no real use to me), shouldn't the rest add up to the obs in the master? After I merge the data sets and drop the obs with _merge==2, the resulting data set has 631 obs (160 + 471), not 625 like I thought it would.
Result                      Number of obs
-----------------------------------------
Not matched                           394
    from master                       160  (_merge==1)
    from using                        234  (_merge==2)
Matched                               471  (_merge==3)
-----------------------------------------
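One possible way to trace the six extra rows is sketched below. It rests on how merge m:m pairs observations within each key group (first with first, second with second, reusing the last observation of the shorter side), so a city/street combination that appears more often in the using than in the master would produce more matched rows than the master has for that key. The file names master2022.dta and using2021.dta and the key variables city and street are hypothetical stand-ins for the real ones.

```stata
* Hypothetical names: master2022.dta, using2021.dta, key variables city street.

* Count how many times each key appears in the using data set
use using2021, clear
contract city street, freq(n_using)
tempfile ucount
save `ucount'

* Count how many times each key appears in the master data set
use master2022, clear
contract city street, freq(n_master)

* Each key now appears once per file, so a 1:1 merge of the counts is safe
merge 1:1 city street using `ucount'

* Drop keys found only in the using file (they were _merge==2 originally)
keep if _merge != 2
replace n_using = 0 if missing(n_using)

* Under m:m pairing, each key contributes max(n_master, n_using) rows
generate rows_after = max(n_master, n_using)

* Keys that would add rows beyond what the master holds
list city street n_master n_using if n_using > n_master

* Total rows expected after the m:m merge and dropping _merge==2
summarize rows_after
display r(sum)
```

If the m:m pairing is the cause, the displayed total should come to 631 rather than 625, and the listed keys show where the difference arises. Running duplicates report city street on each file is a quicker first check of whether the key is unique in either data set.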
Thanks,