  • missings updated on SSC

    Thanks to Kit Baum as ever, the command missings has been updated on SSC, in advance of a formal update in the Stata Journal, which likely would not appear before June in any case.

    Code:
    ssc install missings
    -- except that if you have installed it already, watch out. The program now requires Stata 12, not Stata 9, so anyone who is on Stata 9, 10 or 11 and has previously installed it should not update. I hope that is a small number of readers, ideally zero.
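
    For anyone unsure whether it is safe to update, a cautious check might look like the sketch below. It assumes that missings was originally installed from SSC; the threshold of 12 is taken from the requirement stated above.

    ```stata
    * Show the Stata version you are running; the updated missings needs 12 or later
    display c(stata_version)

    * Show which version of missings (if any) is installed; capture avoids an error if it is absent
    capture noisily which missings

    * Replace an existing installation only when Stata is new enough
    if c(stata_version) >= 12 {
        ssc install missings, replace
    }
    ```

    The replace option is what overwrites a previously installed copy, so it is the step to avoid on Stata 9, 10 or 11.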

    The command missings is perhaps too easy to miss (as it were) because its name is close to missing, a search for which yields a great many hits. missings has been published and revised through the Stata Journal

    Code:
    SJ-20-4 dm0085_2  . . . . . . . . . . . . . . . . Software update for missings
            (help missings if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q4/20   SJ 20(4):1028--1030
            sorting has been extended for missings report
    
    SJ-17-3 dm0085_1  . . . . . . . . . . . . . . . . Software update for missings
            (help missings if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q3/17   SJ 17(3):779
            identify() and sort options have been added
    
    SJ-15-4 dm0085  Speaking Stata: A set of utilities for managing missing values
            (help missings if installed)  . . . . . . . . . . . . . . .  N. J. Cox
            Q4/15   SJ 15(4):1174--1185
            provides command, missings, as a replacement for, and extension
            of, previous commands nmissing and dropmiss
    so that dm0085 is an otherwise unpredictable search term -- both in Stata and (more importantly) if looking for previous mentions on Statalist.

    missings is a bundle of utilities which overlaps a little with official command misstable. I note that it can do everything that mdesc, a popular download from SSC, can do, and much more besides.
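
    To see the overlap for yourself, something like the following sketch runs the official command and missings side by side on the same dataset. The choice of nlswork and of the percent option is purely illustrative; see help missings for the full syntax.

    ```stata
    * Load the example dataset used later in this post
    webuse nlswork, clear

    * Official command: summary of missing values
    misstable summarize

    * missings: report on variables with missing values
    missings report

    * The same report with percentages added
    missings report, percent
    ```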

    What is new is a subcommand missings breakdown, best explained by a few simple examples.

    We read in a standard dataset and add a few more exotic variables.


    Code:
    . webuse nlswork
    (National Longitudinal Survey of Young Women, 14-24 years old in 1968)
    
    
    . gen frog  = .x
    (28,534 missing values generated)
    
    . gen toad = "toad" if mod(_n, 2)
    (14,267 missing values generated)
    A minimal breakdown cross-tabulates variables and distinct flavours of missing values, whether string or numeric.

    Code:
    . missings breakdown
    
    Checking missings in all variables:
    28534 observations with missing values
    
      +---------------------------------------------+
      |            # missing   empty      .      .x |
      |---------------------------------------------|
      |      age          24       .     24       0 |
      |      msp          16       .     16       0 |
      |  nev_mar          16       .     16       0 |
      |    grade           2       .      2       0 |
      | not_smsa           8       .      8       0 |
      |---------------------------------------------|
      |   c_city           8       .      8       0 |
      |    south           8       .      8       0 |
      | ind_code         341       .    341       0 |
      | occ_code         121       .    121       0 |
      |    union        9296       .   9296       0 |
      |---------------------------------------------|
      |   wks_ue        5704       .   5704       0 |
      |   tenure         433       .    433       0 |
      |    hours          67       .     67       0 |
      | wks_work         703       .    703       0 |
      |     frog       28534       .      0   28534 |
      |---------------------------------------------|
      |     toad       14267   14267      .       . |
      +---------------------------------------------+
    There are options to go beyond that, such as focusing on numeric or string variables, or sorting the display.

    Code:
    . missings breakdown, numeric sort(missings)
    
    Checking missings in all numeric variables:
    28534 observations with missing values
    
      +-------------------------------------+
      |            # missing      .      .x |
      |-------------------------------------|
      |    grade           2      2       0 |
      |    south           8      8       0 |
      | not_smsa           8      8       0 |
      |   c_city           8      8       0 |
      |  nev_mar          16     16       0 |
      |-------------------------------------|
      |      msp          16     16       0 |
      |      age          24     24       0 |
      |    hours          67     67       0 |
      | occ_code         121    121       0 |
      | ind_code         341    341       0 |
      |-------------------------------------|
      |   tenure         433    433       0 |
      | wks_work         703    703       0 |
      |   wks_ue        5704   5704       0 |
      |    union        9296   9296       0 |
      |     frog       28534      0   28534 |
      +-------------------------------------+
    
    . missings breakdown, numeric sort(missings descending)
    
    Checking missings in all numeric variables:
    28534 observations with missing values
    
      +-------------------------------------+
      |            # missing      .      .x |
      |-------------------------------------|
      |     frog       28534      0   28534 |
      |    union        9296   9296       0 |
      |   wks_ue        5704   5704       0 |
      | wks_work         703    703       0 |
      |   tenure         433    433       0 |
      |-------------------------------------|
      | ind_code         341    341       0 |
      | occ_code         121    121       0 |
      |    hours          67     67       0 |
      |      age          24     24       0 |
      |  nev_mar          16     16       0 |
      |-------------------------------------|
      |      msp          16     16       0 |
      |    south           8      8       0 |
      |   c_city           8      8       0 |
      | not_smsa           8      8       0 |
      |    grade           2      2       0 |
      +-------------------------------------+

    There is more, but you probably know by now whether this is of interest -- especially if you have not been aware of it before.
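
    As a rough sketch of what else is in the bundle, the other subcommands can be tried on the same dataset. The subcommand and option names here are given as I understand them from the help file, and the variable name nmissing is just an example; check help missings for the authoritative syntax.

    ```stata
    webuse nlswork, clear

    * List observations with missing values on selected variables
    missings list age msp grade

    * Tabulate the number of missing values per observation
    missings table

    * Create a new variable counting missing values in each observation
    missings tag, generate(nmissing)

    * Drop variables, then observations, that are missing throughout;
    * force is needed because the dataset has changed since it was read in
    missings dropvars, force
    missings dropobs, force
    ```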

    The update was stimulated by a question from Jørgen Carling here on Statalist (https://www.statalist.org/forums/for...issings-report). Like me, he's a geographer!

    Last edited by Nick Cox; 28 Jan 2023, 10:57.