  • Describing missing data

    Hi, I'm a new Stata user and I've gone around in circles a bit with this. I'm basically trying to find a percentage of missing data over multiple variables. I'm only after a specific missing value (.a) which I've used to label unexpected missing. I can use 'tab var1, missing' to see the percentage of '.a' values in a single variable but how do I go about viewing the percentage of '.a' values within the whole dataset? I.e. so that I can report the overall level of missing data.

    Any help much appreciated.

    Matt
    Last edited by Matthew Breckons; 07 Sep 2016, 17:36.

  • #2
    Try this:

    Code:
    local dot_a_count = 0
    ds, has(type numeric)
    local numeric_vars `r(varlist)'
    foreach v of local numeric_vars {
        count if `v' == .a
        local dot_a_count = `dot_a_count' + r(N)
    }
    * _continue must sit outside the quoted string so the result prints on the same line
    display "Proportion of .a values among numeric variables: " _continue
    display `dot_a_count'/(`=_N' * `:word count `numeric_vars'')
    Warning: not tested, beware of typos



    • #3
      Thank you for this.

      Do you know if there is a way of getting a single overall figure? A total percentage of missing values within these variables?



      • #4
        What Clyde recommends gives you that, doesn't it? At least when you multiply it by 100 to convert proportion to percentage.

        Edited: If you're looking for any missing value, then change
        Code:
        count if `v' == .a
        to
        Code:
        quietly count if missing(`v')
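
        For reference, here is a minimal self-contained sketch of the whole calculation with that change applied, reporting a percentage rather than a proportion. The toy dataset (x, y, z) and the macro names miss_count, n_vars, n_cells and pct are made up for illustration and are not from the original posts.

        ```stata
        * Toy data for illustration only (hypothetical variables x, y, z)
        clear
        input x y z
        1 .a 3
        .a . 6
        7 8 .b
        .a 11 12
        end

        * Count every kind of missing value (., .a, .b, ...) across numeric variables
        local miss_count = 0
        quietly ds, has(type numeric)
        local numeric_vars `r(varlist)'
        foreach v of local numeric_vars {
            quietly count if missing(`v')
            local miss_count = `miss_count' + r(N)
        }

        * Total cells = observations times number of numeric variables
        local n_vars : word count `numeric_vars'
        local n_cells = _N * `n_vars'
        local pct = 100 * `miss_count' / `n_cells'
        display "Percentage of missing values among numeric variables: " %5.1f `pct'
        ```

        Swapping missing(`v') back to `v' == .a restricts the count to the .a code alone.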
        Last edited by Joseph Coveney; 08 Sep 2016, 00:59.



        • #5
          Apologies - you're right - it does, I had misinterpreted the output. Thanks so much, it really is appreciated!

          Could I ask one further question: if I'm trying to narrow this down to obtain the figure for a group of specified variables, which part(s) of the code do I replace with these? So if I'm after the proportion of missing data within var1, var4 & var6?
          Last edited by Matthew Breckons; 08 Sep 2016, 02:29.



          • #6
            Just list your variables after the ds command, before the comma.
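
            A sketch of what that could look like, assuming the variables are literally named var1, var4 and var6 as in #5. The toy dataset and the macro names chosen_vars, n_vars and n_cells are invented for illustration; the loop is otherwise the one from #2.

            ```stata
            * Toy data for illustration only (hypothetical values)
            clear
            input var1 var2 var3 var4 var5 var6
            .a 1 2 3 4 5
            1 .a .a .a 4 .
            1 2 3 4 5 .a
            1 2 3 .a 5 6
            end

            * Restrict ds to the variables of interest; has(type numeric) still
            * drops any string variables that happen to be in the list
            local dot_a_count = 0
            quietly ds var1 var4 var6, has(type numeric)
            local chosen_vars `r(varlist)'
            foreach v of local chosen_vars {
                quietly count if `v' == .a
                local dot_a_count = `dot_a_count' + r(N)
            }

            * Denominator uses only the chosen variables, not the whole dataset
            local n_vars : word count `chosen_vars'
            local n_cells = _N * `n_vars'
            display "Proportion of .a values in `chosen_vars': " _continue
            display `dot_a_count'/`n_cells'
            ```

            If all the listed variables are known to be numeric, typing the names directly into the local (local chosen_vars var1 var4 var6) and skipping ds should work equally well.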



            • #7
              Thanks for your help all.
