Dear readers,
"Definition: The Levenshtein distance (or edit distance) between two strings is the minimal number of insertions, deletions, and substitutions of one character for another that will transform one string into the other."
"the longest common subsequence (LCS) metric allows only insertion and deletion, not substitution"
The problem: below are two examples. I want to keep the first and eventually drop the second. Both pairs have the same Levenshtein distance (8), so that measure alone cannot separate them. I believe the LCS method could distinguish the first example from the second, but I don't know how to compute it in Stata. (I have already searched the Internet for 3.5 hours.)
mgr_name             _merge  Fundmanagername  levendistance
J. Luther King, Jr.  3       Luther King      8
Matthew Friedman     3       Matthew Schuldt  8
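To make the difference between the two metrics concrete, here is a minimal sketch in Python (not Stata, and not an existing Stata command) that computes both distances for the two pairs above. The function names `lcs_length`, `lcs_distance`, and `levenshtein` are my own labels for this illustration, and the LCS distance is assumed to be the number of single-character insertions and deletions, i.e. len(a) + len(b) - 2 * LCS length.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a: str, b: str) -> int:
    """Edit distance when only insertions and deletions are allowed."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


def levenshtein(a: str, b: str) -> int:
    """Edit distance with insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# The two name pairs from the listing above.
pairs = [
    ("J. Luther King, Jr.", "Luther King"),
    ("Matthew Friedman", "Matthew Schuldt"),
]

for mgr_name, fund_name in pairs:
    print(mgr_name, "|", fund_name,
          "| levenshtein =", levenshtein(mgr_name, fund_name),
          "| lcs_distance =", lcs_distance(mgr_name, fund_name))
```

Under these assumptions both pairs have a Levenshtein distance of 8, while the LCS distance is 8 for the first pair and 13 for the second, because substituting "Friedman" with "Schuldt" has to be paid for as separate deletions and insertions. Whether a cutoff between those two values works on all 100,000+ observations would still need to be checked.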
I would appreciate any suggestions, including other methods for distinguishing between the two examples. I have more than 100,000 observations and, apart from this problem, the datasets were merged correctly.
Kind regards,
Walter Hopkins
"Definition: The Levenshtein distance (or edit distance) between two strings is the minimal number of insertions, deletions, and substitutions of one character for another that will transform one string into the other."
"the longest common subsequence (LCS) metric allows only insertion and deletion, not substitution"