Hi,
I have a question about mass recoding of string variables using loops and locals.
I have a dataset with a number of string variables that contain open-ended answers from respondents about various topics (e.g., “What is the reason you stopped working?”, “What health conditions did you have as a child?”). The survey was primarily conducted in a language other than English; however, many of the respondents and fieldworkers spoke English as a second or third language. As a result, many of these variables contain answers in two or three different languages. I need both to translate all answers into English and to standardize them (so to speak). For example, for the question “What is the reason you stopped working?”, many people gave answers like those in the first code block below.
All of these answers essentially mean either that they ‘Finished their Job/Contract Ended’ or that they ‘Resigned’. While I need to keep the original variable in case other data users want to translate specific answers differently, I want to create a second ‘recoded’ variable that both translates and standardizes the answers, as in the second code block below.
I'm not sure if this is the most efficient way to 'translate' strings, but this is the code I came up with (the third code block below). However, it just seems to delete any matching answers instead of replacing them with the translation, and I can’t figure out why.
David
Code:
clear
input str50 w2ep214_other
"my husband decided to quit then i also resigned"
"awuyohela"
"End of contract"
"iyoti tshikela"
"tshikise hi vana"
"wuyohela"
"contract"
"wuyohela"
"kuyo hela ntirho"
"ulo tshika"
"contract"
"resign"
"end of contract"
"yohela"
end
Code:
clear
input str50(w2ep214_other w2c_ep_214_other_recode)
"my husband decided to quit then i also resigned" "Resigned"
"awuyohela" "Finished Job/Contract Ended"
"End of contract" "Finished Job/Contract Ended"
"iyoti tshikela" "Resigned"
"tshikise hi vana" "Resigned"
"wuyohela" "Finished Job/Contract Ended"
"contract" "Finished Job/Contract Ended"
"wuyohela" "Finished Job/Contract Ended"
"kuyo hela ntirho" "Finished Job/Contract Ended"
"ulo tshika" "Resigned"
"contract" "Finished Job/Contract Ended"
"resign" "Resigned"
"end of contract" "Finished Job/Contract Ended"
"yohela" "Finished Job/Contract Ended"
end
Code:
// Start the recoded variable as a copy of the original answers
clonevar w2c_ep_214_other_recode = w2ep214_other

// Standardized English categories
local E1 "Finished Job/Contract Ended"
local E2 "Resigned"

// Temporary variables listing the original answers for each category
input str30 temp_ep214_list_1 // Finished Job/Contract Ended
"wuyohela"
"yohela"
"awuyohela"
"kuyo hela ntirho"
"contract ended"
"contract"
"End of contract"
"end of contract"
end

input str30 temp_ep214_list_2 // Resigned
"resign"
"ulo tshika"
"iyoti tshikela"
"my husband decided to quit then i also resigned"
"tshikise hi vana"
end

// For each category, replace the first regexr() match of every listed answer with the category label
forvalues i = 1/2 {
    levelsof temp_ep214_list_`i', local(list`i')
    foreach v of local list`i' {
        replace w2c_ep_214_other_recode = regexr(w2c_ep_214_other_recode, "`v'", "`E`i''")
    }
    drop temp_ep214_list_`i'
}
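For comparison, here is a minimal sketch of one possible alternative, not a confirmed fix for the problem described above. It swaps regexr() for exact matching with ==, so that a short answer such as "contract" cannot be replaced inside a longer one such as "end of contract". It also assumes the answer lists are held directly in local macros (the names list1 and list2 are only illustrative) rather than in temporary variables. The example data are the answers from the post.

```stata
* Example data from the post
clear
input str50 w2ep214_other
"my husband decided to quit then i also resigned"
"awuyohela"
"End of contract"
"iyoti tshikela"
"tshikise hi vana"
"wuyohela"
"contract"
"wuyohela"
"kuyo hela ntirho"
"ulo tshika"
"contract"
"resign"
"end of contract"
"yohela"
end

* Keep the original variable; recode a copy
clonevar w2c_ep_214_other_recode = w2ep214_other

* Standardized English categories
local E1 "Finished Job/Contract Ended"
local E2 "Resigned"

* Assumption: answer lists stored in locals (illustrative names), one quoted answer per element
local list1 `" "wuyohela" "yohela" "awuyohela" "kuyo hela ntirho" "contract ended" "contract" "End of contract" "end of contract" "'
local list2 `" "resign" "ulo tshika" "iyoti tshikela" "my husband decided to quit then i also resigned" "tshikise hi vana" "'

* Replace only answers that match a listed string exactly
forvalues i = 1/2 {
    foreach v of local list`i' {
        replace w2c_ep_214_other_recode = "`E`i''" if w2ep214_other == "`v'"
    }
}

* Inspect the result
tabulate w2c_ep_214_other_recode
```

The block is meant to be run as a whole: local macros exist only for the run in which they are defined, so running the loop separately from the `local` lines would leave `E1' and `E2' empty. Answers that match no listed string exactly keep their original text.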