  • How to identify all cases with missing observations

    Dear All

    How can I identify and list all cases with missing observations (denoted by .a) in a data set? A case can have a missing observation on any variable, and once any variable is missing, that case should be counted as having missing data.

    An example of my dataset is below. It contains only a few cases; my full dataset contains 12000 cases, so I want to know if there is an easier way to do this.

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(c2__1 c2__2 c2__3 c2__4 c2__5)
    .a  0  0 .a 0
    0  0  0  1 0
    0  1  1  0 0
    .a  1  0  0 0
    .  .  .  . .
    .a  .  .  . .
    1  0  1  1 0
    .  .  .  . .
    0  1  0  0 0
    0 .a .a  0 0
    1  0  0  0 0
    .  .  .  . .
    1  0  1  0 0
    1  1  1  0 1
    .  .  .  . .
    .  .  .  . .
    0 .a  1  1 0
    .  .  .  . .
    .  .  .  . .
    .  .  .  . .
    end
    ------------------ copy up to and including the previous line ------------------

  • #2
    You typically do not need to identify cases with missing observations. Whenever you calculate anything, Stata automatically omits from that calculation the observations for which any variable it needs is missing. The observations stay in the dataset; they are just not used.

    Alternatively, be more specific about what you want to do. The cases with missing values are already identified and Stata knows that they are missing; the question is what you want to do with them.
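
    As a minimal sketch of that behavior, assuming the example data from post #1 are in memory, the commands below compare the total number of observations with the number that are complete on all five variables. The choice of commands is only illustrative; nothing here is specific to the original poster's analysis.

    ```stata
    * Assumes the example data from post #1 (c2__1 ... c2__5) are in memory.

    * Total number of observations in the dataset
    count

    * Complete cases: missing() returns 1 if ANY of its arguments is missing,
    * whether system missing (.) or extended missing (.a, .b, ...)
    count if !missing(c2__1, c2__2, c2__3, c2__4, c2__5)

    * summarize uses only the observations that are nonmissing on c2__1
    summarize c2__1
    ```

    The observation count reported by summarize is smaller than the first count because observations with a missing c2__1 are left out of the calculation, not deleted from the data.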



    • #3
      And here is an article that deals with missing values
      https://journals.sagepub.com/doi/pdf...867X1501500413

      and offers a user-written command

      Code:
      findit missings
      and follow the instructions to install it. Then you can get various reports, e.g.,

      Code:
      . missings list
      
      Checking missings in all variables:
      13 observations with missing values
      
           +---------------------------------------+
           | c2__1   c2__2   c2__3   c2__4   c2__5 |
           |---------------------------------------|
        1. |    .a       0       0      .a       0 |
        4. |    .a       1       0       0       0 |
        5. |     .       .       .       .       . |
        6. |    .a       .       .       .       . |
        8. |     .       .       .       .       . |
           |---------------------------------------|
       10. |     0      .a      .a       0       0 |
       12. |     .       .       .       .       . |
       15. |     .       .       .       .       . |
       16. |     .       .       .       .       . |
       17. |     0      .a       1       1       0 |
           |---------------------------------------|
       18. |     .       .       .       .       . |
       19. |     .       .       .       .       . |
       20. |     .       .       .       .       . |
           +---------------------------------------+
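
      If a variable marking those observations is wanted rather than a listing, a sketch along the following lines may help. It assumes -missings- has been installed as described above and that the example data are in memory; the variable name nmissing is just an illustrative choice.

      ```stata
      * Assumes -missings- is installed and the example data are in memory.

      * Number of missing values in each variable
      missings report c2__*

      * New variable holding the number of missing values in each observation
      * (nmissing is a hypothetical name chosen for this example)
      missings tag c2__*, generate(nmissing)

      * List only the observations with at least one missing value
      list if nmissing > 0, sep(0)
      ```

      Like -missings list-, this counts both system missing (.) and extended missing (.a) values.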



      • #4
        There might not be good reasons for generating such a variable, as pointed out by Joro. However, as your variables share a stub, and noting that extended missing values sort higher than the system missing value, you can reshape to long and flag the groups (original rows) that contain extended missings. The code below assumes that the only extended missing value in a row is ".a", as in your example. To count any missing value, see egen's function -rowmiss()-.


        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input byte(c2__1 c2__2 c2__3 c2__4 c2__5)
        .a  0  0 .a 0
         0  0  0  1 0
         0  1  1  0 0
        .a  1  0  0 0
         .  .  .  . .
        .a  .  .  . .
         1  0  1  1 0
         .  .  .  . .
         0  1  0  0 0
         0 .a .a  0 0
         1  0  0  0 0
         .  .  .  . .
         1  0  1  0 0
         1  1  1  0 1
         .  .  .  . .
         .  .  .  . .
         0 .a  1  1 0
         .  .  .  . .
         .  .  .  . .
         .  .  .  . .
        end
        
        
        gen obsno=_n
        reshape long c2__, i(obsno) j(which)
        bys obsno (c2__): gen wanted=c2__[_N]==.a
        reshape wide c2__, i(obsno) j(which)

        Result:

        Code:
        . l, sep(20)
        
             +--------------------------------------------------------+
             | obsno   c2__1   c2__2   c2__3   c2__4   c2__5   wanted |
             |--------------------------------------------------------|
          1. |     1      .a       0       0      .a       0        1 |
          2. |     2       0       0       0       1       0        0 |
          3. |     3       0       1       1       0       0        0 |
          4. |     4      .a       1       0       0       0        1 |
          5. |     5       .       .       .       .       .        0 |
          6. |     6      .a       .       .       .       .        1 |
          7. |     7       1       0       1       1       0        0 |
          8. |     8       .       .       .       .       .        0 |
          9. |     9       0       1       0       0       0        0 |
         10. |    10       0      .a      .a       0       0        1 |
         11. |    11       1       0       0       0       0        0 |
         12. |    12       .       .       .       .       .        0 |
         13. |    13       1       0       1       0       0        0 |
         14. |    14       1       1       1       0       1        0 |
         15. |    15       .       .       .       .       .        0 |
         16. |    16       .       .       .       .       .        0 |
         17. |    17       0      .a       1       1       0        1 |
         18. |    18       .       .       .       .       .        0 |
         19. |    19       .       .       .       .       .        0 |
         20. |    20       .       .       .       .       .        0 |
             +--------------------------------------------------------+
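
        For comparison, here is a sketch that avoids the reshape. It assumes the example data are in memory; the new variable names (nmiss, anymiss, has_a) are illustrative only. The first part uses -rowmiss()- to flag any missing value, and the second part loops over the variables to flag ".a" specifically.

        ```stata
        * Assumes the example data (c2__1 ... c2__5) are in memory.

        * Any missing value, system (.) or extended (.a, .b, ...):
        * rowmiss() counts the missing values in each observation
        egen nmiss = rowmiss(c2__*)
        gen byte anymiss = nmiss > 0

        * Only the extended missing value .a, without reshaping:
        * start at 0 and switch to 1 as soon as any variable equals .a
        gen byte has_a = 0
        foreach v of varlist c2__* {
            replace has_a = 1 if `v' == .a
        }

        * List the observations that contain at least one .a
        list if has_a, sep(0)
        ```

        Unlike the reshape approach, the loop still flags a row containing ".a" when the same row also contains a higher extended missing value such as ".b", and it should scale to 12000 cases without restructuring the data.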
