Fuzzy name matching - MATCHIT - diagnose error: "abbreviate must be between 1 and 33"

Francois Durant

Join Date: Dec 2014

Posts: 761
#1


21 Apr 2018, 03:54

Hi,

I am using the command MATCHIT by Julio Raffo (thanks to him!) to fuzzy-match names across 2 datasets. I need the diagnose option, but when I use it I get the following error:

Code:

Matching current dataset with HFnames_MorningStar4_bis.dta
Applying weights function: log
Similarity function: token
Performing preliminary diagnosis
--------------------------------
Analyzing Master file
List of most frequent grams in Master file:
abbreviate must be between 1 and 33

I am using the following code:

Code:

. matchit umgrno mgrname using HFnames_MorningStar4_bis.dta, idu(HF_MS_No) txtu(HF_MS) w(log) di sim(token) t(0) override

I am matching names of asset management companies, many of which contain words such as "capital", "asset", "management", "advisor", "investment", etc.

Can someone help me find out why I'm getting this error? Thanks!
Anders Alexandersson

Join Date: Apr 2014

Posts: 203
#2

21 Apr 2018, 07:02

. set trace on

See https://www.stata.com/support/faqs/p...gging-program/
Francois Durant

Join Date: Dec 2014

Posts: 761
#3

21 Apr 2018, 08:31

Thanks Anders, that seems to be a very useful tool actually!

However, I still cannot find the error. Can anyone who knows MATCHIT help? Thanks.
Anders Alexandersson

Join Date: Apr 2014

Posts: 203
#4

21 Apr 2018, 10:09

Reduce the trace depth so that the output is short enough to spot where the error occurs, for example:

Code:

set trace on
set tracedepth 2
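
A fuller sketch of such a debugging session, assuming the matchit call from post #1, could look like the following. The log file name matchit_trace.log is only a placeholder; capture noisily lets the do-file continue to the cleanup lines after the error.

```stata
* Hypothetical log file name, chosen for illustration only
capture log close
log using matchit_trace.log, replace text

* Trace program execution, but only two levels deep
set trace on
set tracedepth 2

* Rerun the failing command; the trace shows the line that raises the error
capture noisily matchit umgrno mgrname using HFnames_MorningStar4_bis.dta, ///
    idu(HF_MS_No) txtu(HF_MS) w(log) di sim(token) t(0) override

* Always switch tracing off again and close the log
set trace off
log close
```

The last traced line before the error message should be the list call that carries the abbreviate option.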
Julio Raffo

Join Date: May 2014

Posts: 132
#5

25 Apr 2018, 08:55

Hi François,

It seems to me that the command list is returning an error on the option abbreviate. In the code, I set it to ab(50) and never got an error (I've just retested it in Stata 14 and 15). In these two editions, list will only return an error if you set ab() to >129. Maybe there's a different edition or version of Stata for which this limit is set to 33.

If you want to circumvent this issue, you could manually change the ab(50) in the ado file to ab(33). Please note that the diagnose option uses list three times. If more people face this issue, I'll be happy to correct it myself.
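
A minimal sketch of how that manual edit might be done, assuming matchit was installed as matchit.ado somewhere on the adopath (the exact location differs by machine):

```stata
* Show which matchit.ado Stata is actually using
which matchit

* Locate the file; findfile stores the full path in r(fn)
findfile matchit.ado

* Open it in the do-file editor, then replace ab(50) with ab(33)
* in the three list calls used by the diagnose option and save
doedit "`r(fn)'"

* Drop automatically loaded programs so the edited file is reloaded
discard
```

Keeping a backup copy of the original ado file before editing makes it easy to revert.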

Best,

J.
Evan Fuller

Join Date: Jun 2018

Posts: 2
#6

04 Sep 2018, 15:10

I am getting the same error using the diagnose option in Stata IC 13.1.
Julio Raffo

Join Date: May 2014

Posts: 132
#7

15 Apr 2019, 07:09

I have updated -matchit- on GitHub and SSC (thanks, Kit Baum!) to handle this issue, among other things. Unfortunately, I do not have a copy of Stata 13 or lower to fully test it. I would much appreciate it if Francois Durant, Evan Fuller or someone else could check it and let me know.
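
For anyone who wants to check it, one way to pick up the update from SSC and retest is sketched below. It assumes the original installation came from SSC and reuses the command from post #1.

```stata
* Replace the installed copy with the current SSC version
ssc install matchit, replace

* Confirm which file and version Stata now finds
which matchit

* Rerun the call that previously failed with the diagnose option
matchit umgrno mgrname using HFnames_MorningStar4_bis.dta, ///
    idu(HF_MS_No) txtu(HF_MS) w(log) di sim(token) t(0) override
```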