Dear Statalist, I would like to ask about the matchit command. Because this question differs somewhat from my earlier question about matchit, I am posting it separately (my apologies in advance).
For each firm listed in the patent dataset, I want to find the same firm in the Amadeus (firm) dataset. I would like to know how to remove matched firms from both datasets (patents and Amadeus firms) so that matchit can be run iteratively. That is, suppose I first do a merge on exact firm names between the two datasets. The firms matched at that step should be excluded, in both datasets, from the matchit step that follows.
For the firms not merged, I would then use matchit. However, matchit may assign the same Amadeus firm to several different firm names in the patent dataset (two different patent firm names matched to the same Amadeus firm name). In those cases I need to keep only the pair with the highest score as matched and remove both of its firm names from their respective datasets. I would then rerun matchit for the poorer matches and for the patent firms that lost the duplicated Amadeus firm to a higher-scoring pair. This should be repeated until all firms in the patent dataset are matched without such duplications.
Afterwards, I need to join all the matches in a single file. Is it possible to run matchit iteratively in this way, storing the accepted matches at each round and rerunning matchit for the duplicated cases (those with the lower score) and for the poorer matches (say, a score below 0.9)?
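To make the procedure concrete, here is a rough sketch of the loop I have in mind. It is written in Python only to illustrate the logic and is not Stata or matchit code. The `similarity` function uses `difflib.SequenceMatcher` as a stand-in and does not reproduce matchit's scoring, and the names `iterative_match` and `threshold` are hypothetical. In this sketch I also assume that a patent firm whose best score stays below the threshold is simply left unmatched.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity score in [0, 1]; an assumption, NOT matchit's algorithm."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def iterative_match(patent_names, amadeus_names, threshold=0.9):
    """Return (matches, unmatched_patent_names); each Amadeus firm is used at most once."""
    patents = list(dict.fromkeys(patent_names))   # drop duplicate names, keep order
    firms = list(dict.fromkeys(amadeus_names))
    matches = []                                  # (patent name, Amadeus name, score)

    # Step 1: exact merge on identical names; matched names leave both lists.
    exact = set(patents) & set(firms)
    matches += [(p, p, 1.0) for p in patents if p in exact]
    patents = [p for p in patents if p not in exact]
    firms = [f for f in firms if f not in exact]

    # Step 2: fuzzy rounds, repeated until a round accepts no new pair.
    while patents and firms:
        best = {}                                 # Amadeus name -> (patent name, score)
        for p in patents:
            score, f = max(((similarity(p, f), f) for f in firms),
                           key=lambda pair: pair[0])
            # Accept only good scores; when two patent names pick the same
            # Amadeus firm, keep the pair with the higher score.
            if score >= threshold and (f not in best or score > best[f][1]):
                best[f] = (p, score)
        if not best:
            break
        matches += [(p, f, s) for f, (p, s) in best.items()]
        taken = {p for p, _ in best.values()}
        # Accepted names leave both lists; the losers are retried next round.
        patents = [p for p in patents if p not in taken]
        firms = [f for f in firms if f not in best]

    return matches, patents
```

Calling `iterative_match(patent_list, amadeus_list)` returns all accepted pairs in a single list, which corresponds to the single file of matches I would like to end up with, plus the patent firm names that never reached the threshold.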
I do not know how to copy/paste the dataex output from my previous post about matchit, but you can find it in #1 of: Matching patent and firm-level data using matchit - Statalist
