Combining fuzzy and perfect merge

Matheus Proenca

Join Date: Jan 2024

Posts: 1
#1

08 Jan 2024, 08:25

Hi everyone! I have two datasets, each containing the variables "classroom_code" and "student_name", and I’m looking for a way to merge them. I want to allow a fuzzy match on names (e.g., a similarity score above 0.75) while requiring an exact match on classroom codes (i.e., names should only be matched if classroom_code is identical). Do you know of a way to do this? I’m aware of the matchit command for the fuzzy-match part, but I don’t think it lets me add the condition that other variables must match exactly. Is that right? Thanks a lot!
Andrew Musau

Join Date: Oct 2014

Posts: 10235
#2

08 Jan 2024, 12:15

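Run matchit (from SSC) on the names and keep the pairs with a similarity score above 0.75. For each such pair, replace the two spellings with one common name in both datasets. What you want after that is a standard merge using the replaced name and classroom code as key variables: since merge requires the keys to agree exactly, names are only linked when classroom_code is identical.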
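To make the logic concrete, here is a minimal sketch of the same two-step idea in Python rather than Stata, so treat it as an illustration and not as matchit syntax. It assumes each dataset is a list of records with "classroom_code" and "student_name"; the function names, the toy records, and the extra fields "grade" and "attendance" are all made up for the example. The similarity score comes from Python's difflib, which is a stand-in and will not give the same numbers as matchit. One difference from the description above: candidates are restricted to the same classroom before scoring, so a name can never be harmonized to a spelling from another class.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1] from difflib (a stand-in, not matchit's score)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def harmonize_names(master, using, threshold=0.75):
    """Copy `using`, replacing each name with the closest master spelling
    from the same classroom when the similarity exceeds `threshold`."""
    names_by_class = {}
    for row in master:
        names_by_class.setdefault(row["classroom_code"], []).append(row["student_name"])

    harmonized = []
    for row in using:
        candidates = names_by_class.get(row["classroom_code"], [])
        best = max(candidates,
                   key=lambda name: similarity(name, row["student_name"]),
                   default=None)
        new_row = dict(row)
        if best is not None and similarity(best, row["student_name"]) > threshold:
            new_row["student_name"] = best
        harmonized.append(new_row)
    return harmonized


def exact_merge(master, using):
    """Inner merge on (classroom_code, student_name); both keys must be equal."""
    using_by_key = {(r["classroom_code"], r["student_name"]): r for r in using}
    merged = []
    for m in master:
        u = using_by_key.get((m["classroom_code"], m["student_name"]))
        if u is not None:
            merged.append({**u, **m})
    return merged


# Hypothetical toy data, for illustration only.
master = [
    {"classroom_code": "A1", "student_name": "Maria Silva", "grade": 9},
    {"classroom_code": "B2", "student_name": "Joao Souza", "grade": 7},
]
using = [
    {"classroom_code": "A1", "student_name": "Maria Sylva", "attendance": 0.95},
    {"classroom_code": "C3", "student_name": "Joao Souza", "attendance": 0.80},
]

merged = exact_merge(master, harmonize_names(master, using))
print(merged)
```

In the toy data, "Maria Sylva" is linked to "Maria Silva" because both are in classroom A1, while the two "Joao Souza" records stay unmatched because their classroom codes differ, even though the names are identical. If a name could plausibly match several classmates, you would want to inspect ties before replacing anything.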