  • Fuzzy name matching - MATCHIT - diagnose error: "abbreviate must be between 1 and 33"

    Hi,

    I am using the command MATCHIT from Julio Raffo (thanks to him!) to fuzzy match names across two datasets. I need the diagnose option, and when I use it I get the following error:
    Code:
    Matching current dataset with HFnames_MorningStar4_bis.dta
    Applying weights function: log
    Similarity function: token
     
    Performing preliminary diagnosis
    --------------------------------
     
    Analyzing Master file
    List of most frequent grams in Master file:
    abbreviate must be between 1 and 33
    I am using the following code:
    Code:
    matchit umgrno mgrname using HFnames_MorningStar4_bis.dta, idu(HF_MS_No) txtu(HF_MS) w(log) di sim(token) t(0) override
    I am matching the names of asset management companies, many of which contain words such as "capital", "asset", "management", "advisor", "investment", etc.

    Can someone help me find out why I'm getting this error? Thanks!


  • #2
    . set trace on

    See https://www.stata.com/support/faqs/p...gging-program/



    • #3
      Thanks Anders, that seems to be a very useful tool!

      However, I still cannot find the error. Can anyone who knows MATCHIT help? Thanks.



      • #4
        Reduce the trace depth so that the output is short enough to locate the error, for example:

        Code:
        set trace on
        set traced 2
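
        A fuller sketch of the same idea, assuming the matchit call from post #1 and a hypothetical log file name (matchit_trace.log): writing the trace to a log makes it easier to search for the line that raises the error. Note that set traced is the abbreviated form of set tracedepth.

        Code:
        * send the trace to a text log so it can be searched afterwards
        log using matchit_trace.log, replace text
        set trace on
        set tracedepth 2
        * capture noisily lets a do-file continue past the error, so trace is switched off again
        capture noisily matchit umgrno mgrname using HFnames_MorningStar4_bis.dta, idu(HF_MS_No) txtu(HF_MS) w(log) di sim(token) t(0) override
        set trace off
        log close

        The failing command should appear among the last traced lines before the "abbreviate must be between 1 and 33" message.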



        • #5
          Hi François,

          It seems to me that the list command is returning an error on its abbreviate() option. In the matchit code I set it to ab(50) and never got an error (I've just retested it in Stata 14 and 15). In these two versions, list only returns an error if ab() is set above 129. Maybe there is a different edition or version of Stata in which this limit is 33.

          If you want to circumvent this issue, you could manually change ab(50) to ab(33) in the ado file. Please note that the diagnose option uses list three times, so there are three occurrences to change. If more people face this issue, I'll be happy to correct it myself.
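
          A minimal sketch of that manual edit, assuming matchit.ado is installed somewhere on the adopath. It only locates and opens the file; the ab(50) occurrences still have to be changed by hand.

          Code:
          * show where the installed matchit.ado lives
          which matchit
          * store the full path in r(fn) and open the file in the Do-file Editor
          findfile matchit.ado
          doedit "`r(fn)'"
          * after saving the edited file, drop the cached copy so the new one is loaded
          discard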

          Best,

          J.



          • #6
            I am getting the same error with the diagnose option in Stata IC 13.1.



            • #7
              I have updated -matchit- on GitHub and SSC (thanks, KitBaum!) to handle this issue, among other things. Unfortunately, I do not have a copy of Stata 13 or lower to fully test it. I would much appreciate it if Francois Durant, Evan Fuller or someone else could check it and let me know.
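
              For anyone testing, a minimal sketch of pulling the updated version from SSC and re-running the call from post #1 with diagnose, assuming the same dataset and variable names as in that post:

              Code:
              * replace the installed copy with the current SSC version
              ssc install matchit, replace
              * clear any cached copy of the program and confirm which file is now in use
              discard
              which matchit
              * re-run the original call, including the diagnose option (di)
              matchit umgrno mgrname using HFnames_MorningStar4_bis.dta, idu(HF_MS_No) txtu(HF_MS) w(log) di sim(token) t(0) override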
