Fuzzy merge between two large datasets

Isidora Vergara

04 Dec 2019, 13:31

Hi,

I am trying to do a fuzzy merge between two large datasets (500,000 observations and 13,000,000 observations; 5 variables each).

I have no difficulty with the smaller dataset, since I am using it as my base. Each of its observations should correspond to one observation in the large dataset.

The actual problem is that I am trying to do this with the matchit command, but it takes forever to run. I cannot say exactly how long, since it was still at 0% progress after running for a full day.
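For context, a matchit call of the kind described has roughly the shape below. This is only a sketch under stated assumptions: the file names (small.dta, large.dta) and variable names (id_small, name_small, id_large, name_large) are hypothetical placeholders, not the actual ones, and the default similarity method is assumed.

```stata
* One-time installation from SSC (matchit relies on freqindex)
ssc install matchit
ssc install freqindex

* Load the smaller dataset as the master (placeholder file name)
use small.dta, clear

* Fuzzy-match the master text variable against the larger using dataset.
* All id and text variable names here are placeholders.
matchit id_small name_small using large.dta, idusing(id_large) txtusing(name_large)
```

With the smaller file in memory as the master, every candidate match is still drawn from the 13,000,000-observation using file, which is where the run time goes.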

Is there any way to do the same thing, but faster?

I considered using the large dataset as my base instead, but I do not think that is right, since I do not know how many of its observations actually have a match in the smaller dataset.

Just in case, I am currently using Stata 14.

Regards,
Isidora.