An excerpt from my data is given in the first code block below.
I would like to create a dummy that is 1 for each id_notice_cn if admin_languages_tender contains additional ("overlapping") languages that are not contained in OfficialLanguages. For example, 20221033 and 20221034 have "DE|IT" in admin_languages_tender, whereas OfficialLanguages contains only "IT". Hence, the dummy should be 1 for those two observations.
The tricky part is that the dummy should not be sensitive to the order of the languages within the language codes (e.g. some observations might have "DE|IT" and others "IT|DE", but that should not change the result).
What I tried so far is shown in the second code block below. However, it does not produce the desired result.
Any help would be greatly appreciated.
Code:
clear
input long id_notice_cn str2 iso_country_code str5 admin_languages_tender str11 OfficialLanguages
20221025 "IT" "IT" "IT"
20221026 "PL" "PL" "PL"
20221027 "FR" "FR" "FR"
20221028 "FR" "FR" "FR"
20221029 "FR" "FR" "FR"
20221030 "FI" "FI" "FI|SV"
20221031 "DE" "DE" "DE"
20221032 "IT" "IT" "IT"
20221033 "IT" "DE|IT" "IT"
20221034 "IT" "DE|IT" "IT"
20221035 "ES" "CA|ES" "ES|CA|GL|EU"
20221036 "LT" "LT" "LT"
end
Code:
sort id_notice_cn
gen lang_diff = 0
split OfficialLanguages, p("|")
foreach langvar in OfficialLanguages1 OfficialLanguages2 OfficialLanguages3 OfficialLanguages4 {
    replace lang_diff = 1 if strpos(admin_languages_tender, `langvar') == 0 & admin_languages_tender != ""
}
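To make the intended logic concrete, here is a minimal sketch of one possible order-insensitive check, not a definitive solution. It goes the other way around from the attempt above: it splits admin_languages_tender into its individual codes and flags an observation as soon as one of those codes is missing from OfficialLanguages. The names extra_lang and adm_lang are hypothetical, and the sketch assumes that codes are always separated by "|" without stray spaces.

```stata
* Hypothetical dummy: 1 if any tender language is not an official language
gen byte extra_lang = 0

* Break "DE|IT" into adm_lang1 = "DE", adm_lang2 = "IT", ...
split admin_languages_tender, parse("|") generate(adm_lang)

foreach v of varlist adm_lang* {
    * Wrap both sides in "|" so only whole codes match, in any order
    replace extra_lang = 1 if `v' != "" & ///
        strpos("|" + OfficialLanguages + "|", "|" + `v' + "|") == 0
}

* Remove the helper variables created by split
drop adm_lang*
```

On the excerpt above, this should set the dummy to 1 only for 20221033 and 20221034, because "DE" does not appear in their OfficialLanguages; "CA|ES" against "ES|CA|GL|EU" stays 0 regardless of order.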