I am new to the matchit command and am finding it hard to understand what the different options mean and which would best suit my needs.
For example, this is the code I am using:
use "DIRECTORY-dataset1", clear
matchit SAMPLE_ID SOME_KIND_OF_NAME using "directory-dataset2", idu(ID) txtu(colors) sim(token) t(0)
I am not sure whether it would help if I recreated this example as dummy data in proper Stata dataex format. If so, let me know and I can try to ask my question in a different way with actual data.
Dataset 1
SOME_KIND_OF_NAME |
THESQUIRL WAS YELLOW AND SMOOTH |
THESQUIRL WAS YELLOW SUNSHINE AND SMOOTH |
THESQUIRLWASPURPLE |
BLUE MUFFINS ARE-AWESOME |
BLUE-RAY MUFFINS ARE |
Dataset 2 (lookup table)
COLORS |
GREEN |
PURPLE |
YELLOW SUNSHINE |
BLUE-RAY |
MATCH
THESQUIRL WAS YELLOW AND SMOOTH | YELLOW SUNSHINE > wrong (I only want a match when the long string contains exactly YELLOW SUNSHINE, with the two words together) |
THESQUIRL WAS YELLOW SUNSHINE AND SMOOTH | YELLOW SUNSHINE |
THESQUIRLWASPURPLE | PURPLE |
BLUE MUFFINS ARE-AWESOME | BLUE-RAY > wrong (I only want a match when the long string contains exactly BLUE-RAY, with the two parts together) |
BLUE-RAY MUFFINS ARE | BLUE-RAY |
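To make the rule I am after concrete, here is a minimal sketch that does not use matchit at all. It is only an illustration under my own assumptions, not necessarily the recommended approach: the toy data are typed in from the tables above, the tempfile name lookup is made up, and I have left out the ID variables. It forms every name-by-color pair with cross and keeps a pair only when strpos() finds the whole lookup phrase inside the long string.

```stata
* Toy copy of Dataset 2 (lookup table); values typed in from the table above
clear
input str40 COLORS
"GREEN"
"PURPLE"
"YELLOW SUNSHINE"
"BLUE-RAY"
end
tempfile lookup
save `lookup'

* Toy copy of Dataset 1
clear
input str60 SOME_KIND_OF_NAME
"THESQUIRL WAS YELLOW AND SMOOTH"
"THESQUIRL WAS YELLOW SUNSHINE AND SMOOTH"
"THESQUIRLWASPURPLE"
"BLUE MUFFINS ARE-AWESOME"
"BLUE-RAY MUFFINS ARE"
end

* Form all pairwise combinations of names and colors
cross using `lookup'

* Keep a pair only if the complete lookup phrase occurs verbatim
* somewhere inside the long string (strpos() returns 0 if it does not)
keep if strpos(SOME_KIND_OF_NAME, COLORS) > 0

list SOME_KIND_OF_NAME COLORS, noobs
```

On the five example rows this keeps only the three pairs I want (YELLOW SUNSHINE, PURPLE, and BLUE-RAY, each once) and drops the two I marked as wrong. Because cross builds every combination, I assume it would become slow or memory-hungry on large files, which is part of why I am asking whether matchit has an option that behaves this way.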
Thank you!