  • missing values with egen anycount

    Hi there. Is it possible to generate a newvar that counts up the number of particular values for a varlist but excludes the missing values? How do I make sure that the 0 does not include missing (.) values?

  • #2
    I don't follow what you want here. If you count anything but missing, then missing values are ignored.

    If you're looking at how many 7s there are, a count of 0 won't tell you whether the values seen were all non-missing but just not 7, or whether some or all of them were missing.

    See for example

    Code:
    . clear
    
    . set obs 3
    number of observations (_N) was 0, now 3
    
    . gen y = cond(_n == 1, 7,  cond(_n == 2, 42, .))
    (1 missing value generated)
    
    . l y
    
         +----+
         |  y |
         |----|
      1. |  7 |
      2. | 42 |
      3. |  . |
         +----+
    
    . egen count7 = anycount(y), values(7)
    
    . l
    
         +-------------+
         |  y   count7 |
         |-------------|
      1. |  7        1 |
      2. | 42        0 |
      3. |  .        0 |
         +-------------+
    Otherwise put, if you want to keep track of the number of missing values in any observation in certain variables, you need to do that directly. In egen use rowmiss().
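
    A minimal sketch of that bookkeeping, extending the example above with a second variable z (hypothetical, added only for illustration): count the 7s with anycount() and the missings with rowmiss(), and keep both results side by side.

    ```stata
    clear
    set obs 3
    gen y = cond(_n == 1, 7, cond(_n == 2, 42, .))
    * z is a hypothetical second variable, not part of the original example
    gen z = cond(_n == 2, 7, .)

    * number of 7s among y and z in each observation
    egen count7 = anycount(y z), values(7)
    * number of missing values among y and z in each observation
    egen nmiss = rowmiss(y z)

    list
    ```

    In the third observation count7 is 0 and nmiss is 2, so that 0 reflects a lack of information rather than observed values other than 7.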


    • #3
      Sounds to me like you want -egen-'s rownonmiss(varlist) [, strok]. From the help: it may not be combined with by. It gives the number of nonmissing values in varlist for each observation (row); this is the value used by rowmean() for the denominator in the mean calculation.
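
      A small sketch of what rownonmiss() returns, assuming toy data with two numeric variables y and z (both names and values are illustrative only). It also checks the stated link to rowmean(): the row mean equals the row total divided by the nonmissing count wherever that count is positive.

      ```stata
      clear
      set obs 3
      * y and z are illustrative variables
      gen y = cond(_n == 1, 7, cond(_n == 2, 42, .))
      gen z = cond(_n == 2, 7, .)

      * number of nonmissing values among y and z in each observation
      egen nonmiss = rownonmiss(y z)
      * row mean and row total over the same variables
      egen mean = rowmean(y z)
      egen total = rowtotal(y z)

      * the denominator of the row mean is the nonmissing count
      assert mean == total / nonmiss if nonmiss > 0
      * with nothing observed, the row mean is missing
      assert missing(mean) if nonmiss == 0

      list
      ```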


      • #4
        Thank you for your reply, Nick. I would like to create a variable, say count7 in your example that reports 1 for 7, 0 for 42, and . if the value is missing like in the third row of your example. Perhaps anycount will not work. I don't want the missing value (.) to be counted as 0. In other words, I care both that seven is counted correctly but I also want the non-seven values to be integer values and not missing values, if that makes sense.


        • #5
          anycount() tells you the truth: it counts what you ask. But if you want missing as the result if any argument is missing, then you can write

          1. your alternative to anycount()

          2. your own loop.

          Code:
          gen wanted = 0
          
          * assumes a local macro named varlist already holds the names of your variables
          foreach v of local varlist  {
                 replace wanted = cond(missing(`v'), ., wanted + (`v' == 7)) if wanted < .
          }
          where 7 is just an example you can generalise.

          3. your fix after counting missings


          Code:
          * varlist is a placeholder for your own variable names
          egen wanted = anycount(varlist), values(7)
          egen nmissing = rowmiss(varlist)
          replace wanted = . if nmissing
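
          As a check, here is a sketch that runs approaches 2 and 3 on toy data and confirms they agree. The variables y and z and the names wanted2, wanted3 and count7 are assumptions made for illustration. For a single variable, as in the example in #2, one line is enough.

          ```stata
          clear
          set obs 3
          * y and z are illustrative variables
          gen y = cond(_n == 1, 7, cond(_n == 2, 42, .))
          gen z = cond(_n == 2, 7, .)

          * approach 2: loop, result turns missing as soon as any variable is missing
          gen wanted2 = 0
          foreach v of varlist y z {
              replace wanted2 = cond(missing(`v'), ., wanted2 + (`v' == 7)) if wanted2 < .
          }

          * approach 3: count first, then blank out rows with any missing value
          egen wanted3 = anycount(y z), values(7)
          egen nmissing = rowmiss(y z)
          replace wanted3 = . if nmissing

          * the two approaches should give identical results
          assert wanted2 == wanted3

          * single-variable case: 1 for 7, 0 for other values, missing if y is missing
          gen count7 = (y == 7) if !missing(y)

          list
          ```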

          Last edited by Nick Cox; 23 Jul 2020, 11:46.


          • #6
            Thank you, Nick. That worked!
