  • Tabmiss sort

    Hi all statalizers,

    I am performing a descriptive analysis. I want to list the variables that have missing values, sorted by the number or frequency of missing values. Can you think of anything I might use? I have tried mdesc (SSC), misstable, nmissing, missing and tabmiss, and searched the internet, but none of them seems to have an option for sorting.

    Thank you in advance.

  • #2
    Hello Joan,

    Please check whether this is what you want:

    Code:
    . sysuse auto.dta
    (1978 Automobile Data)
    
    . codebook rep78
    
    ---------------------------------------------------------------------------------------------------------------------------------------
    rep78                                                                                                                Repair Record 1978
    ---------------------------------------------------------------------------------------------------------------------------------------
    
                      type:  numeric (int)
    
                     range:  [1,5]                        units:  1
             unique values:  5                        missing .:  5/74
    
                tabulation:  Freq.  Value
                                 2  1
                                 8  2
                                30  3
                                18  4
                                11  5
                                 5  .
    
    . replace rep78 = .a in 3
    (1 real change made, 1 to missing)
    
    . replace rep78 = .c in 7
    (1 real change made, 1 to missing)
    
    . replace rep78 = .c in 45
    (1 real change made, 1 to missing)
    
    . replace rep78 = .a in 51
    (1 real change made, 1 to missing)
    
    . list rep78 if missing(rep78)
    
         +-------+
         | rep78 |
         |-------|
      3. |    .a |
      7. |    .c |
     45. |    .c |
     51. |    .a |
     64. |     . |
         +-------+
    
    . gsort rep78, mfirst
    
    . list rep78 if missing(rep78)
    
         +-------+
         | rep78 |
         |-------|
     70. |     . |
     71. |    .a |
     72. |    .a |
     73. |    .c |
     74. |    .c |
         +-------+

    Best,

    Marcos



    • #3
      mdesc is from SSC (Rose Medeiros)

      misstable is official.

      nmissing is from Stata Journal (myself).

      missing: do you mean missings (myself, Stata Journal)?

      tabmiss is from SSC (Marcelo Coca-Perraillon).

      You're likely to be right that none of them offers sorting. But it's programmable.

      Code:
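      webuse nlswork, clear

      * expand _all into an explicit list before the two result variables are added
      unab all : _all

      * result variables: row i will describe the i-th variable
      * (assumes at least as many observations as variables)
      gen nmissing = .
      gen varname = ""

      local i = 1
      qui foreach v of local all {
          * count missing values in the current variable; store count and name in row i
          count if missing(`v')
          replace nmissing = r(N) in `i'
          replace varname = "`v'" in `i'
          local ++i
      }

      * largest number of missing values first
      gsort -nmissing

      * show only variables with at least one missing value
      list varname nmissing if inrange(nmissing, 1, .) , noobs sep(0)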
      
        +---------------------+
        |  varname   nmissing |
        |---------------------|
        |    union       9296 |
        |   wks_ue       5704 |
        | wks_work        703 |
        |   tenure        433 |
        | ind_code        341 |
        | occ_code        121 |
        |    hours         67 |
        |      age         24 |
        |      msp         16 |
        |  nev_mar         16 |
        |   c_city          8 |
        | not_smsa          8 |
        |    south          8 |
        |    grade          2 |
        +---------------------+
      Last edited by Nick Cox; 07 Jun 2016, 06:53.



      • #4
        How can I add column percentages to this list?



        • #5
          Joan:
          shamelessly taking advantage of Nick's code, you may want to try:
          Code:
          * percentage (not proportion) of observations missing for each listed variable
          gen perc_nmissing = 100 * nmissing / _N
          format perc_nmissing %6.2f
          list varname nmissing perc_nmissing if inrange(nmissing, 1, .) , noobs sep(0)
          Kind regards,
          Carlo
          (Stata 19.0)
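
          A variation on the approach in this thread, offered only as a sketch: the counts and percentages can be collected with postfile into a temporary dataset, so that no extra variables are added to the data in memory. The names memhold, results and pctmissing are illustrative choices made here, not part of any of the commands mentioned above.

          ```stata
          webuse nlswork, clear

          * temporary handle and file for the results (names are illustrative)
          tempname memhold
          tempfile results
          postfile `memhold' str32 varname long nmissing double pctmissing using `results'

          * one posted row per variable: name, number missing, percent missing
          quietly foreach v of varlist _all {
              count if missing(`v')
              post `memhold' ("`v'") (r(N)) (100 * r(N) / _N)
          }
          postclose `memhold'

          * look at the results, then return to the original data
          preserve
          use `results', clear
          gsort -nmissing varname
          format pctmissing %6.2f
          list varname nmissing pctmissing if nmissing > 0, noobs sep(0)
          restore
          ```

          Because the results live in their own file, this does not require the dataset to have at least as many observations as variables, and the original data are unchanged after restore.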
