  • using rowtotal and strpos to sum values of certain variables that contain a sp

    Dear Statalist,
    I have a dataset in which some of the variables are political candidates. The variable names are composed of sub-blocks that identify each candidate's characteristics. For example, in aaabbbccc1, aaa is the state, bbb is the district, ccc is the party affiliation, and the final digit is 1 if the candidate won the seat and 0 otherwise.

    I would like to know how to use rowtotal on only those variables whose names contain certain letters in a certain position. In particular, how do you rowtotal only the variables whose names contain "ccc" in the 7th to 9th positions? I tried *ccc*, but that matches any variable whose name contains ccc anywhere, and I want only those with ccc in the 7th to 9th positions.

    Any help on how I could do that?

    Thank you very much for the help.
    Daniel

  • #2
    not sure I completely understand (e.g., when you say "variable names" I assume you actually mean "variable values") but here is a guess for the "if" condition of your egen statement; since you did not give us the variable name at issue I use "varname" - just replace it with your actual variable name
    Code:
    if inrange(strpos(varname,"ccc"),7,9)
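
    As a quick check of what this condition evaluates to, here is a minimal sketch using literal strings in place of "varname". The two test strings are borrowed from the variable names that appear later in this thread and are used purely for illustration. strpos() returns the position of the first occurrence only, so a string with an earlier "ccc" fails the inrange() test even when positions 7 to 9 also hold "ccc"; substr() tests those positions directly.

```stata
* strpos() reports where the FIRST "ccc" starts
display strpos("ggggggccc1", "ccc")
* shows 7
display strpos("gggcccccc0", "ccc")
* shows 4

* so the inrange() condition misses the second string
display inrange(strpos("gggcccccc0", "ccc"), 7, 9)
* shows 0

* substr() looks at positions 7-9 regardless of earlier matches
display substr("gggcccccc0", 7, 3) == "ccc"
* shows 1
```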



    • #3
      Thank you Rich for the quick reply. I meant "variable names", not "variable values". I have 100 variables that each represent a candidate in an election. Each row holds the number of votes each candidate obtained in one electoral district. Instead of the candidate's name, the variable name is a combination of several characteristics of the candidate. For example, the variable name "aaabbbccc1" denotes a candidate from the state "aaa", the district "bbb", and the political party "ccc" who won a seat in the election (1 if he won, 0 if he lost).

      So what I want is to sum, for each electoral district, the total votes obtained by candidates from one party. For that, I was thinking of using rowtotal on all variables whose names include "ccc" in the 7th to 9th positions. Would your code work in that case, where I'm summing across different variables that meet a condition on their names, rather than values within a variable?

      Thanks again.
      Daniel



      • #4
        How could I change the code
        egen x=rowtotal(*ccc*)
        to specify that I want all variables whose names contain "ccc", but only in the 7th to 9th positions of the name?



        • #5
          The output of help varlist tells us that in a variable list, the wild card "?" will match a single character.
          Code:
          . list, clean abbreviate(10)
          
                 ggggggccc1   cccgggggg1   gggcccccc0  
            1.            1            1            1  
          
          . list ??????ccc?, clean abbreviate(10)
          
                 ggggggccc1   gggcccccc0  
            1.            1            1
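
          Applied to the egen question in #4, the same wildcard pattern can be passed straight to rowtotal(). This is a minimal sketch on toy data that reuses the three variable names from the listing above; the new variable name party_ccc is made up for illustration.

```stata
* Toy data: one row, three 10-character variable names
clear
input ggggggccc1 cccgggggg1 gggcccccc0
1 1 1
end

* Each "?" matches exactly one character, so ??????ccc? selects only
* 10-character names with "ccc" in positions 7-9
egen party_ccc = rowtotal(??????ccc?)

* cccgggggg1 is excluded, so the total for this row is 2
list, clean abbreviate(10)
```

          Because "?" matches exactly one character, this pattern only picks up names that are exactly 10 characters long. If names can carry a longer suffix after position 9, ??????ccc* should work instead, since "*" matches zero or more characters.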



          • #6
            Thank you William. It was very helpful and worked well. Best
