  • Counting observations with no missing data

    Hi: I'm fairly new to Stata and was hoping to get some help. I have >100K observations with ~120 variables. Is there a way to count the number of observations with no missing data across all 120 variables? Thanks!

  • #2
    Code:
    egen miss = rowmiss(*)
    count if miss == 0


    • #3
      That didn't work... count came back as 0. Is miss a new variable?


      • #4
        Yes, miss is a new variable that records, for each observation, how many of your existing variables are missing in that observation. A count of zero suggests that no observation is complete on every variable: each observation has at least one missing value somewhere. If you know this is untrue, please post a data extract using the dataex command, as suggested in the Statalist FAQ. That will help with troubleshooting.
        Last edited by Hemanshu Kumar; 30 Oct 2022, 08:58.
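
        To see what miss holds before trusting the count, a minimal sketch along these lines may help. It uses the auto dataset shipped with Stata as a stand-in for the real data, which is an assumption made purely for illustration; the loop at the end is one possible way to spot a variable that is missing everywhere, since a single such variable forces the count to 0.

        ```stata
        * Sketch using the shipped auto dataset as a stand-in for the real data
        sysuse auto, clear

        * Number of variables that are missing in each observation
        egen miss = rowmiss(*)

        * Distribution of the per-observation missing counts
        tabulate miss

        * Observations that are complete on every variable
        count if miss == 0

        * Flag any variable that is missing in every observation,
        * since a single such variable forces the count above to 0
        foreach v of varlist * {
            quietly count if missing(`v')
            if r(N) == _N {
                display "`v' is missing in every observation"
            }
        }
        ```

        If the loop prints a variable name, dropping that variable from the rowmiss() list would be the first thing to try.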


        • #5
          The user-written command mvpatterns can also show the patterns of missing values and the number of missing values for each variable:

          Code:
          net install dm91
          
          sysuse auto, clear
          mvpatterns make - gear_ratio
          Results:
          Code:
          . mvpatterns make - gear_ratio
          variables with no mv's: make price mpg headroom trunk weight length turn displacement gear_ratio
          
          Variable     | type     obs   mv   variable label
          -------------+---------------------------------------
          rep78        | int       69    5   Repair record 1978
          -----------------------------------------------------
          
          Patterns of missing values
          
            +------------------------+
            | _pattern   _mv   _freq |
            |------------------------|
            |        +     0      69 |
            |        .     1       5 |
            +------------------------+
          Just make sure the 120 variables are stored next to each other in the dataset; that way you can use a dash (-) to specify them as a range, like:

          Code:
          mvpatterns variable001- variable120
          Also, regarding post #3, a count of 0 is entirely possible; it just means that no observation is complete on all variables. If your 120 variables include open-ended questions asked as a follow-up to an "other" response, you would likely get 0. Also try tabulate miss to see the distribution of the number of missing values per observation.
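
          If some variables are follow-ups that are skipped by design, one option is to leave them out of the row count. A minimal sketch with the auto data, where rep78 is only a hypothetical stand-in for such a follow-up item and miss_core is a made-up variable name:

          ```stata
          sysuse auto, clear

          * All variables except the (hypothetical) follow-up item rep78
          ds rep78, not
          local core `r(varlist)'

          * Missing counts over the core variables only
          egen miss_core = rowmiss(`core')
          tabulate miss_core

          * Observations complete on the core variables
          count if miss_core == 0
          ```

          With your own data, the varlist given to ds would name the open-ended follow-up variables instead.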


          • #6
            Ken Chui

            There are several such and let's first note misstable as the quite versatile official command.

            I note mdesc from SSC and missings from the Stata Journal, and I wouldn't be surprised at yet others.
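
            For orientation, a rough sketch of how the official command might be called, again on the auto data as a stand-in. The mdesc lines are left commented out because they need a one-time download from SSC.

            ```stata
            sysuse auto, clear

            * Official command: counts of missing values per variable
            * (only variables with missing values are listed)
            misstable summarize

            * Official command: patterns of missing values across variables,
            * shown as frequencies rather than percentages
            misstable patterns, frequency

            * Community-contributed alternative from SSC (one-time install)
            * ssc install mdesc
            * mdesc
            ```

            As far as I know, misstable skips string variables, so if any of the 120 variables are strings the egen rowmiss() approach in #2 remains the safer way to count fully complete observations.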


            • #7
              Originally posted by Nick Cox
              Ken Chui

              There are several such and let's first note misstable as the quite versatile official command.

              I note mdesc from SSC and missings from the Stata Journal, and I wouldn't be surprised at yet others.
              Thanks, Nick. Noted. I'll take some time to get familiar with misstable.
