  • Creating a variable that records the number of groups per state

    I am working with the Ethnic Power Relations dataset and am interested in two variables, statename and group. I want to create a variable that holds the number of groups per state. I first tried "egen group_numb = count(group), by(statename)", but it returns a number much higher than the actual number of groups in the state. For example, Sierra Leone has six groups, yet this variable returns 54. This happens because the dataset also records time spans ("from" and "to" years), so each group name is repeated once per time span, and count() counts every nonmissing observation rather than every distinct group:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str84 group
    "Northern Groups (Temne, Limba)"
    "Mende"                         
    "Creole"                        
    "Kono"                          
    "Northern Groups (Temne, Limba)"
    "Mende"                         
    "Creole"                        
    "Kono"                          
    "Mende"                         
    "Temne"                         
    "Limba"                         
    "Creole"                        
    "Kono"                          
    "Mende"                         
    "Temne"                         
    "Limba"                         
    "Creole"                        
    "Kono"                          
    "Northern Groups (Temne, Limba)"
    "Mende"                         
    "Creole"                        
    "Kono"                          
    "Northern Groups (Temne, Limba)"
    "Mende"                         
    "Creole"                        
    "Kono"                          
    "Northern Groups (Temne, Limba)"
    "Mende"                         
    "Creole"                        
    "Kono"                          
    "Northern Groups (Temne, Limba)"
    "Mende"                         
    "Creole"                        
    "Kono"                          
    "Mende"                         
    "Temne"                         
    "Limba"                         
    "Creole"                        
    "Kono"                          
    "Mende"                         
    "Temne"                         
    "Limba"                         
    "Creole"                        
    "Kono"                          
    "Mende"                         
    "Temne"                         
    "Limba"                         
    "Kono"                          
    "Creole"                        
    "Mende"                         
    "Temne"                         
    "Limba"                         
    "Kono"                          
    "Creole"                        
    end
    For this reason, I would like to ask how I can create a variable that records the number of distinct groups per state without counting these repeats. Thank you for the help!


  • #2
    Your example data is not very helpful since it does not include the statename variable. But I think what you want is this:
    Code:
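    * Within each state, sort by group and keep a running count of distinct group names
    by statename (group), sort: gen group_num = sum(group != group[_n-1])
    * The running count at the last observation of each state is the number of distinct groups
    by statename: gen wanted = group_num[_N]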
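
    If you prefer egen, the same count can probably be obtained by tagging one observation per distinct statename-group pair and summing the tags within each state. This is only a sketch on made-up toy data: the values "A", "B", "X", "Y" and the new variable names tag and n_groups are illustrative assumptions, not taken from your dataset.
    Code:
    * Toy data (hypothetical values): state A has two distinct groups, state B has one
    clear
    input str1 statename str1 group
    "A" "X"
    "A" "X"
    "A" "Y"
    "B" "X"
    "B" "X"
    end
    * Flag exactly one observation per distinct statename-group pair
    egen byte tag = tag(statename group)
    * Sum the flags within each state to get the number of distinct groups
    egen n_groups = total(tag), by(statename)
    list, sepby(statename)
    Here n_groups should be 2 for every observation of state A and 1 for every observation of state B, regardless of how many times each group is repeated across time spans.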
