Dear Community,
I have the following two string variables in the dataset:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str32 judgename str26 pco_pakistanallyears
"A.N. Bhandari" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" ""
"Aalia Neelam" "A.N. Bhandari"
end
I want to construct a dummy variable that satisfies three criteria:
1) It takes the value 1 if the string in judgename is found anywhere in the pco_pakistanallyears variable, i.e. the entry exists in both variables.
2) It takes the value 0 if the entry exists in judgename but does NOT exist in pco_pakistanallyears (and is missing otherwise).
3) Since the same judge is mentioned multiple times in judgename, it takes the value 1 only once per judge, i.e. at that judge's first mention.
countmatch satisfies the first two conditions but not the third. The code I tried, based on the help for countmatch, is as follows:
Code:
countmatch judgename pco_pakistanallyears, gen(pcojudgesdummy)
However, I want to construct the variable only for unique matches of judgename with pco_pakistanallyears. Going by the help for countmatch, I probably have to use the tag() function of egen, but everything I have tried does NOT work, e.g. countmatch judgename pco_pakistanallyears, egen = tag(pcojudgesdummy) fails.
Can anybody help me out here? Thank you in advance!
Cheers,
Sultan
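One possible way to meet all three criteria, sketched in base Stata without countmatch. This is only a sketch under stated assumptions: a match means an exact, case-sensitive match on the full name; repeat mentions of a matched judge should get 0 rather than missing; and "first mention" means first in the current sort order of the data. The helper variable obsno is introduced here only for illustration.

```stata
* Criterion 2: start at 0 wherever judgename is non-missing (missing otherwise)
gen byte pcojudgesdummy = 0 if judgename != ""

* Criterion 1: collect the distinct non-empty names in pco_pakistanallyears
* and set the dummy to 1 wherever judgename equals one of them
levelsof pco_pakistanallyears, local(pconames)
foreach name of local pconames {
    replace pcojudgesdummy = 1 if judgename == `"`name'"'
}

* Criterion 3: keep the 1 only at each judge's first mention
* (assumption: later mentions of a matched judge become 0)
gen long obsno = _n
bysort judgename (obsno): replace pcojudgesdummy = 0 if _n > 1 & pcojudgesdummy == 1

* Restore the original order and drop the helper variable
sort obsno
drop obsno
```

With the example data above, this should leave pcojudgesdummy equal to 1 for the single "A.N. Bhandari" observation and 0 for every "Aalia Neelam" observation. Because levelsof stores all names in one local macro, a very long list of names may run into macro length limits, in which case a merge-based approach would be the safer route.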