I am working with the Ethnic Power Relations dataset and am interested in two variables, statename and group. I would like to create a variable that shows the number of groups per state. At first I tried "egen group_numb = count(group), by(statename)", but it returns a number much higher than the actual number of groups in the state. For example, Sierra Leone has six groups, yet this variable returns 54. This happens because the dataset also records "from" and "to" years (creating time spans), so each group name is repeated once per time span, and count() counts every non-missing observation rather than every distinct name (see the data example below).
How can I create a variable that records the number of groups per state while avoiding this problem? Thank you for the help!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input str84 group
"Northern Groups (Temne, Limba)"
"Mende"
"Creole"
"Kono"
"Northern Groups (Temne, Limba)"
"Mende"
"Creole"
"Kono"
"Mende"
"Temne"
"Limba"
"Creole"
"Kono"
"Mende"
"Temne"
"Limba"
"Creole"
"Kono"
"Northern Groups (Temne, Limba)"
"Mende"
"Creole"
"Kono"
"Northern Groups (Temne, Limba)"
"Mende"
"Creole"
"Kono"
"Northern Groups (Temne, Limba)"
"Mende"
"Creole"
"Kono"
"Northern Groups (Temne, Limba)"
"Mende"
"Creole"
"Kono"
"Mende"
"Temne"
"Limba"
"Creole"
"Kono"
"Mende"
"Temne"
"Limba"
"Creole"
"Kono"
"Mende"
"Temne"
"Limba"
"Kono"
"Creole"
"Mende"
"Temne"
"Limba"
"Kono"
"Creole"
end
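One possible approach, sketched here on toy data, is to flag a single observation per distinct statename-group pair with egen's tag() function and then sum those flags within each state. The toy values and the helper variable name first_obs are assumptions made for illustration; they are not taken from the Ethnic Power Relations data.

```stata
* Toy data (hypothetical values, for illustration only)
clear
input str20 statename str40 group
"Sierra Leone" "Mende"
"Sierra Leone" "Mende"
"Sierra Leone" "Creole"
"Sierra Leone" "Kono"
"Sierra Leone" "Kono"
end

* Flag exactly one observation per distinct statename-group pair
egen first_obs = tag(statename group)

* Sum the flags within each state: the number of distinct group names
egen group_numb = total(first_obs), by(statename)

list statename group group_numb, sepby(statename)
```

On this toy data group_numb is 3 in every row, even though there are five observations. Note that this counts distinct names, so "Northern Groups (Temne, Limba)" would be counted separately from "Temne" and "Limba" unless the names are harmonized first.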