
  • Fuzzy merge between two large datasets


    I am trying to do a fuzzy merge between two large datasets (500,000 and 13,000,000 observations, with 5 variables each).

    I have no difficulty with the smaller dataset, since I am using it as my base dataset. Each of its observations should correspond to one observation in the large dataset.

    The actual problem is that I am trying to do this with the matchit command, but it takes forever to run (I cannot say exactly how long, since it was still at 0% progress after running for a full day).

    Is there any way I could do the same thing, but faster?

    I considered using the large dataset as my base instead, but I do not think that is right, since I do not know how many of its observations actually have a match in the smaller dataset.

    In case it matters, I am currently using Stata 14.