Dear Stata users,
I am interested in a specific part of a string, which is always in the same position; I have to extract it and store it in a new string variable.
In particular, I am interested in the psychotherapists' title, i.e. the text between the first "|||" and the comma that follows it.
Note that the part of the string occupied by the name and surname may contain punctuation or other symbols such as - or & (because it may be the name of the therapist's practice rather than their own name); however, it never contains a blank space.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str92 Title
"namesurname|||Counselor,||MA,||LPCC-S,|||NCC,||RPT-S,||TCADC||||"
"name-surname|||Counselor,||LCPC|||"
"name&surn,ame|||ClinicalSocialWork/Therapist,||MSW,||LCSW,||ADS||||"
"name,surn-ame|||LicensedProfessionalCounselor,|||MA,||LPC,||NCC|||"
"name.s;urname|||LicensedProfessionalCounselor,|||MA,||LPC|||"
end
Based on some useful online examples I came up with the following:
Code:
gen TitleClean = regexs(1) if regexm(Title, "$.*[|||](.*)[,||].*")
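In my mind, the regexm function should capture the first run of characters of any length (reading from the left, which is why there is $ at the beginning) that:
- comes after any combination of characters ".*" followed by "|||"
- comes before a comma and two bars ",||", which is followed by any combination of characters ".*"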
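A possible reason the attempt above does not return the title: in Stata regular expressions "^" (not "$") anchors the start of the string, square brackets define a character class so "[|||]" matches a single bar rather than three, and ".*" is greedy. Below is a minimal sketch of an alternative pattern. It assumes the name part never contains a bar and the title itself never contains a comma or a bar; it rebuilds the example data so that it can be run on its own.

```stata
* Sketch only: rebuild the example data, then extract the title.
clear
input str92 Title
"namesurname|||Counselor,||MA,||LPCC-S,|||NCC,||RPT-S,||TCADC||||"
"name-surname|||Counselor,||LCPC|||"
"name&surn,ame|||ClinicalSocialWork/Therapist,||MSW,||LCSW,||ADS||||"
"name,surn-ame|||LicensedProfessionalCounselor,|||MA,||LPC,||NCC|||"
"name.s;urname|||LicensedProfessionalCounselor,|||MA,||LPC|||"
end

* Pattern, piece by piece:
*   ^[^|]+     from the start, one or more characters that are not a bar
*   [|][|][|]  three literal bars (a character class avoids escaping)
*   ([^,|]+)   capture everything up to the next comma or bar
gen TitleClean = regexs(1) if regexm(Title, "^[^|]+[|][|][|]([^,|]+)")

list TitleClean
```

On the five example observations this should give "Counselor" twice, "ClinicalSocialWork/Therapist" once, and "LicensedProfessionalCounselor" twice. Observations that do not match the pattern are left empty, which makes them easy to spot.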
Instead, the new variable should look as follows:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str92 TitleClean
"Counselor"
"Counselor"
"ClinicalSocialWork/Therapist"
"LicensedProfessionalCounselor"
"LicensedProfessionalCounselor"
end
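Since the title is always in the same position, the target values above might also be reachable without regular expressions, using only strpos() and substr(). The sketch below assumes every observation contains "|||" and at least one comma after it; the helper variables start and rest are hypothetical names introduced for illustration.

```stata
* Sketch only: regex-free extraction on the same example data.
clear
input str92 Title
"namesurname|||Counselor,||MA,||LPCC-S,|||NCC,||RPT-S,||TCADC||||"
"name-surname|||Counselor,||LCPC|||"
"name&surn,ame|||ClinicalSocialWork/Therapist,||MSW,||LCSW,||ADS||||"
"name,surn-ame|||LicensedProfessionalCounselor,|||MA,||LPC,||NCC|||"
"name.s;urname|||LicensedProfessionalCounselor,|||MA,||LPC|||"
end

* Position of the first character after the first "|||"
gen start = strpos(Title, "|||") + 3

* Everything from that position to the end of the string
gen rest = substr(Title, start, .)

* Keep the part of the remainder that precedes its first comma
gen TitleClean = substr(rest, 1, strpos(rest, ",") - 1)

drop start rest
list TitleClean
```

Commas inside the name part do no harm here, because the search for the comma starts only after the first "|||".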
Any idea on how to build TitleClean?