  • Plotting missing variables

    Dear all,
    I work in an organization that runs an annual perception survey with over 100 questions each year. I want to visualize the missing values in the data set.

    I searched a lot and found that missing values can be visualized with the user-written missingplot command, but that is not what I want.
    My aim is to compare the numbers of missing and non-missing values for all variables in a bar graph.

    After two days of effort, I wrote my own command, shown below:

    Code:
    cap program drop missplot
    program def missplot
        syntax [varlist] [if] [, val]

        * save the data in memory so they can be restored at the end
        qui snapshot save
        local snap = `r(snapshot)'

        cap keep `if'
        keep `varlist'

        * recode every variable to 1 (missing) or 0 (non-missing);
        * capture skips the replace that does not match the variable type
        foreach i of varlist * {
            cap replace `i' = 0 if `i' != .
            cap replace `i' = "0" if `i' != ""
            cap replace `i' = 1 if `i' == .
            cap replace `i' = "1" if `i' == ""
            cap destring `i', replace
        }
        label drop _all

        * drop variables that have no missing values
        qui foreach i of varlist * {
            sum `i'
            if `r(mean)' == 0 {
                drop `i'
            }
        }

        * nothing left to plot: restore the data and stop
        qui des
        if `r(k)' == 0 {
            dis as text in red ". No missing values found"
            snapshot restore `snap'
            snapshot erase `snap'
            exit
        }

        * one row per observation and variable, then plot
        rename (*) (miss*)
        gen id = _n
        qui reshape long miss, i(id) j(vars) string

        lab def miss 1 "Missing values" 0 "Non-missing values"
        lab val miss miss
        dis as text ". Please make sure you have installed tabplot command:" in red " ssc install tabplot"
        tabplot miss vars, showval sep(miss) bar1(bcolor(blue)) bar2(bcolor(red)) xtitle("") ytitle("") subtitle("")

        * with option val, also list the number of missing values per variable
        if "`val'" != "" {
            collapse (count) miss if miss == 1, by(vars)
            gsort -miss
            list, noobs
        }

        * always give the user back the original data
        snapshot restore `snap'
        snapshot erase `snap'
    end

    webuse studentsurvey, clear
    missplot *
    missplot *, val
    It solves my problem, but it is a little complicated, especially for those who are new to Stata.

    Does anyone here know a simpler way to plot missing versus non-missing values in a bar chart? I would also appreciate any suggested edits to the code above.

    Thank you a lot,
    Fahim



  • #2
    Here's another approach.

    Code:
    webuse studentsurvey, clear
    
    capture program drop anotherplot
    
    program anotherplot
        syntax [varlist] [if] [in]
        
        quietly {
        
        marksample touse, novarlist
    
        tempname freq
        local j = 0
        foreach v of var * {
            count if missing(`v') & `touse'
            if r(N) > 0 {
                local ++j
                gen `freq'`j' = cond(_n == 2, r(N), _N - r(N)) in 1/2
                local label`j' "`v'"
            }
        }
    
        if `j' == 0 {
            noisily display "no missing values"
            exit 0
        }
    
        preserve
    
        keep in 1/2
        keep `freq'*
        tempvar y x
        
        gen `y' = _n
        reshape long `freq' , i(`y') j(`x')
    
        forval j = 1/`j' {
            label define `x' `j' "`label`j''", modify
        }
        label val `x' `x'
        
        label define `y' 1 "non-missing" 2 missing
        label val `y' `y'
        
        } /* end quietly */
    
        tabplot `y' `x' [fw=`freq'], showval ytitle("") xtitle("") ///
        subtitle(counts) sep(`y') ///
        bar1(bfcolor(blue*0.3) blcolor(blue)) bar2(blcolor(red) bfcolor(red*0.3))
    end
    
    anotherplot
    [Figure: anotherplor.png]

    Last edited by Nick Cox; 29 Jul 2018, 06:02. Reason: and for "greater than" rendering.



    • #3
      The previous won't look good if you have just 1 variable with missing values -- or 30 or 300 or 3000. Compare the output of missings (Stata Journal).

      Code:
      . missings report
      
      Checking missings in all variables:
      123 observations with missing values
      
      -----------------
               |     #
      ---------+-------
           age |     3
        female |     3
          dept |     9
       comment |   123
      -----------------
      
      . missings report, sort
      
      Checking missings in all variables:
      123 observations with missing values
      
      -----------------
               |     #
      ---------+-------
       comment |   123
          dept |     9
           age |     3
        female |     3
      -----------------
      
      . missings report, sort percent
      
      Checking missings in all variables:
      123 observations with missing values
      
      --------------------------
               |     #        %
      ---------+----------------
       comment |   123    98.40
          dept |     9     7.20
           age |     3     2.40
        female |     3     2.40
      --------------------------
      Note that (at the time of writing)

      Code:
      search dm0085_1, entry
      is the best way to get a clickable download link.

      Code:
      SJ-17-3 dm0085_1  . . . . . . . . . . . . . . . . Software update for missings
              (help missings if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q3/17   SJ 17(3):779
              identify() and sort options have been added
      
      SJ-15-4 dm0085  Speaking Stata: A set of utilities for managing missing values
              (help missings if installed)  . . . . . . . . . . . . . . .  N. J. Cox
              Q4/15   SJ 15(4):1174--1185
              provides command, missings, as a replacement for, and extension
              of, previous commands nmissing and dropmiss



      • #4
        Originally posted by Nick Cox
        Here's another approach.


        Thank you very much, Nick Cox.
        When I replaced my own program with the one you suggested (anotherplot), I got the error below, and I can't figure out the reason.

        Code:
        amp; invalid name
        r(198);





        • #5
          Thanks for the signal.

          You're seeing HTML mark-up inserted by the forum software.

          I've corrected that and another similar problem in #2. Let me know if there are other difficulties.
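
For readers who copied the code before the correction, here is a minimal sketch of what such mark-up looks like and what Stata expects instead. The commented lines are an assumption about how the affected lines appeared on the page (presumably `&amp;` in place of `&` and `&gt;` in place of `>`); the variables dept and age come from the studentsurvey data used in this thread.

```stata
webuse studentsurvey, clear

* As copied with HTML entities (assumed form; lines like these would fail,
* for example with "amp; invalid name"):
*     count if missing(...) &amp; ...
*     if r(N) &gt; 0 {

* As intended, with plain & and > operators:
quietly count if missing(dept) & !missing(age)
if r(N) > 0 {
    display "dept is missing in " r(N) " observations where age is known"
}
```

If a similar error appears after pasting code from a web page, searching the pasted text for `&amp;`, `&gt;` and `&lt;` is a quick first check.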



          • #6
            Originally posted by Nick Cox
            Thanks for the signal.

            You're seeing HTML mark-up inserted by the forum software.

            I've corrected that and another similar problem in #2. Let me know if there are other difficulties.
            It works well now, thanks a lot!

            I changed line #8 from
            Code:
              foreach v of var *
            to
            Code:
              foreach v of var `varlist'
            so that I can plot missing values for specific variables.





            • #7
              Indeed. That was a bug. The code allowed a specified varlist, but then ignored it. Sorry about that.
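
The difference can be seen in a small sketch. The program name showvars and its returned scalar r(k) are hypothetical and exist only for this illustration. After `syntax [varlist]`, the local macro `varlist' holds the variables the user named, or all variables if none were given, so looping over it respects the user's choice, whereas looping over * always visits every variable.

```stata
capture program drop showvars

* hypothetical helper: list the variables the loop actually visits
program showvars, rclass
    syntax [varlist]
    local k = 0
    foreach v of varlist `varlist' {
        display "`v'"
        local ++k
    }
    * number of variables visited
    return scalar k = `k'
end

webuse studentsurvey, clear
showvars age dept
showvars
```

The first call visits only age and dept; the second, with no varlist, falls back to all variables in the data set.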



              • #8
                All in all, that was very useful.
                It looks like there is a lot more that I have to learn about Stata programming.
