  • Foreach loop to count values >0 across multiple variables

    Hi,

    I am struggling to generate a loop which counts the number of values>0 across a set of 17 encoded string variables.

    I have 17 variables (Bet1 - Bet17) and the data is in wide format. I want to generate a loop that counts the number of cells with a value>0 for each individual in the dataset (n=392), so that the count variable starts at 0 and adds one every time there is a cell>0 across the variables Bet1 to Bet17. It will calculate a total daily bets variable at the end.

    Does anyone know where to start with this? My code so far looks like this:

    local vars Bet1 Bet2 Bet3 Bet4 Bet5 Bet6 Bet7 Bet8 Bet9 Bet10 Bet11 Bet12 Bet13 Bet14 Bet15 Bet16 Bet17

    gen followcount = 0
    foreach v in local var {
    replace followcount = followcount + 1 if "v">0
    }
    ta followcount


    The error I am getting is related to the "v" being an invalid name - I have also tried i and x and neither of those work.


  • #2
    Code:
    `v'
    also,
    Code:
    foreach v of local vars
    Last edited by Øyvind Snilsberg; 13 Dec 2022, 03:46.



    • #3
      There are three errors here I can see with different flavours and consequences.

      You defined a local macro vars but then tried to use local var. It is not an error to call a loop over a macro that doesn't exist, but Stata will do nothing.

      However, you typed in local var so as far as Stata is concerned that is a list over two items local and var.

      I am surprised at your error report as the most obvious problem to me is that "v" > 0 is a type mismatch.

      Stata uses double quotes " " for one purpose only, to delimit literal strings, but a literal string can't be compared with a number.
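
A minimal sketch of the first and third points, assuming it is run from a do-file and using a shortened, purely illustrative macro of three names. It only displays text, so no Bet variables need to exist.

```stata
local vars Bet1 Bet2 Bet3

* "of local vars" loops over the contents of the macro vars
foreach v of local vars {
    display "v"      // the literal one-letter string v, every time
    display "`v'"    // the macro's contents: Bet1, then Bet2, then Bet3
}

* "in local var" loops over the two words typed: local and var
foreach v in local var {
    display "`v'"
}
```

The left quote before v and the right quote after it are what tell Stata to substitute the macro's contents; double quotes alone never do that.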

      So, as Oyvind Snilsberg implies, your code would have been better as

      Code:
      local vars Bet1 Bet2 Bet3 Bet4 Bet5 Bet6 Bet7 Bet8 Bet9 Bet10 Bet11 Bet12 Bet13 Bet14 Bet15 Bet16 Bet17
      
      gen followcount = 0
      foreach v of local vars {
         replace followcount = followcount + 1 if `v' >0
      }
      
      ta followcount
      Your code can be simplified and/or made more bullet-proof in various ways. Here are some.


      Code:
      gen followcount = 0
      forval j = 1/17 {
         replace followcount = followcount + 1 if Bet`j'  > 0
      }
      
      
      // this one excludes missing values, likely to be what you want if any are present 
      gen followcount = 0
      forval j = 1/17 {
         replace followcount = followcount + (Bet`j'  > 0 & Bet`j' < .) 
      }
      
      
      // this one is right if and only if Bet* catches Bet1 ... Bet17 
      // note also the point above about missing values 
      gen followcount = 0
      foreach v of var Bet* {
         replace followcount = followcount + 1 if `v'  > 0
      }
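
A minimal sketch of why the missing-value guard matters, assuming a made-up dataset of three observations and three hypothetical variables Bet1 to Bet3 (not the poster's data). Stata treats numeric missing as larger than any number, so a missing value satisfies > 0. The variable name naive is introduced here only for comparison.

```stata
clear
input Bet1 Bet2 Bet3
5 0 2
0 0 0
. 3 1
end

* naive count: a missing value satisfies > 0, so it is counted too
gen naive = 0
* guarded count: only nonmissing positive values are counted
gen followcount = 0

foreach v of varlist Bet* {
    replace naive = naive + (`v' > 0)
    replace followcount = followcount + (`v' > 0 & `v' < .)
}

list
```

In the third observation the naive count is 3 and the guarded count is 2, because the missing Bet1 is counted only by the naive version. The other two observations agree.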





      • #4
        Thanks - that lets the code run through but I am getting a value of 2 for all of the individuals (which is incorrect). Could this be to do with the fact that I have encoded the string variables? Does Stata read the encoded variable as being of numerical value in the code?



        • #5
          Originally posted by Nick Cox

          Thank you, the simplified code worked!



          • #6
            Perhaps #4 was answered indirectly by #5 -- it wasn't visible to me while I was typing #5 -- but if not we need to see a data example.

            Note that within 17 relevant variables we don't absolutely need to see them all. You just need to show us a replicable example where the code doesn't do what you want or expect and you don't understand why.

            Stata (*) encodes empty strings as missing values.

            But watch out: encode is often a wrong choice. You may need destring. This dialogue illustrates.

            Code:
            . clear
            
            . set obs 2
            Number of observations (_N) was 0, now 2.
            
            . gen havethis = cond(_n == 1, "42", "")
            (1 missing value generated)
            
            . list
            
                 +----------+
                 | havethis |
                 |----------|
              1. |       42 |
              2. |          |
                 +----------+
            
            . encode havethis, gen(possibleanswer)
            
            . destring havethis, gen(anotheranswer)
            havethis: all characters numeric; anotheranswer generated as byte
            (1 missing value generated)
            
            . list
            
                 +--------------------------------+
                 | havethis   anothe~r   possib~r |
                 |--------------------------------|
              1. |       42         42         42 |
              2. |                   .          . |
                 +--------------------------------+
            
            . list, nola
            
                 +--------------------------------+
                 | havethis   anothe~r   possib~r |
                 |--------------------------------|
              1. |       42         42          1 |
              2. |                   .          . |
                 +--------------------------------+
            encode is for mapping categories to numbers 1 up. So "cat" "dog" "eagle" would get mapped to 1 2 3, with nothing else said. "1000", "200", "30" would also get mapped to 1 2 3, which is almost never what you want. (That is their dictionary order.)

            destring is for replacing numbers misread as strings with numbers.
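
A minimal sketch of that contrast, assuming a toy string variable s holding the three values mentioned above. The names s, encoded and destrung are hypothetical.

```stata
clear
input str4 s
"1000"
"200"
"30"
end

* encode assigns integer codes in dictionary order of the strings
encode s, gen(encoded)

* destring converts the digits themselves to numbers
destring s, gen(destrung)

* nolabel shows the stored numbers rather than the value labels
list, nolabel
```

With nolabel, encoded holds 1, 2, 3 while destrung holds 1000, 200, 30. Because encode assigns codes from 1 upward, a condition such as encoded > 0 is true for every non-empty string, including "0", which is one way a count of values > 0 across encoded variables can go wrong.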

