  • Deleting Observations w/ Missing Values on Particular Variables

    Hello all!

    I have a data set with demographics first, followed by two scales each 12 variables long. Each scale has missing data randomly throughout.

    I'd like to delete observations that are missing on all 24 variables across both scales.

    I've used:

    Code:
    generate no_miss = !missing(OEE2Y-OEE31X)
    to identify observations that have missing values on the specified variables, but this code flags (=1) observations that have any missing data at all.

    Is there code that would allow me either to identify observations that are missing on all 24 variables (much like using egen) or to find and delete observations for which all 24 variables are missing?

    Thank you!

    Morgan

  • #2
    Morgan
    Assuming that you have already checked that your data are missing completely at random (otherwise you will probably be dealing with a biased sample, and your statistics will be biased as well), you may want to try something along the following lines:
    Code:
    set obs 10
    g id=_n
    g A=2+_n in 1/5
    g B=A+_n in 1/5
    egen flag=group(A-B)
    drop if flag==.
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Carlo,

      Thank you for your response. Perhaps I was not clear. I have partially missing scale responses that I'd like to keep so they can contribute to future analyses. I am only interested in removing observations that are missing on all survey variables.

      So, for instance, consider the data set as a scale consisting of 4 variables:

      Code:
      set obs 10
      g id=_n
      g A=2+_n in 1/5
      g B=A+_n 
      g C=_n+3 in 1/9
      g D=_n+4 in 1/6
      I'd like to identify and remove only observation 10, the one with all variables missing.

      It seems that this:

      Code:
      egen flag=group(A-D)
      identifies whether any of the variables in the A-D range are missing. I am only concerned with identifying observations that are missing on all variables in question.

      Thank you!

      • #4
        Morgan:
        you may want to try something along the following lines:
        Code:
        set obs 10
        g id=_n
        g A=2+_n in 1/5
        g B=A+_n
        g C=_n+3 in 1/9
        g D=_n+4 in 1/6
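        * Count how many of A-D are missing in each observation
        egen flag=rowmiss(A-D)
        * Drop observations in which all 4 variables are missing
        drop if flag==4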
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          Try this:
          Code:
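          * Count nonmissing values across the scale items in each observation
          egen nmcount = rownonmiss(OEE2Y-OEE31X)
          * Drop observations with no nonmissing value on any of them
          drop if nmcount == 0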
          Added: Crossed with #4, which provides a related but somewhat different solution.
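
          A minimal sketch, reusing the toy data from #3, of how the two approaches relate. The variable names nmiss and nnonmiss are made up for illustration. For numeric variables, "all four are missing" is the same condition as "none is nonmissing", so both rules pick out the same observations; the rownonmiss() version simply avoids hard-coding the number of variables.

          ```stata
          * Toy data from #3: only observation 10 is missing on all of A-D
          clear
          set obs 10
          generate id = _n
          generate A = 2 + _n in 1/5
          generate B = A + _n
          generate C = _n + 3 in 1/9
          generate D = _n + 4 in 1/6

          * Illustrative names: count missing and nonmissing values per row
          egen nmiss    = rowmiss(A-D)
          egen nnonmiss = rownonmiss(A-D)

          * The two drop conditions agree in every observation
          assert (nmiss == 4) == (nnonmiss == 0)

          * Drop the all-missing rows; partially missing rows are kept
          drop if nnonmiss == 0
          assert _N == 9
          ```

          With the real data, the rowmiss() threshold would be 24 rather than 4, assuming OEE2Y-OEE31X spans exactly the 24 scale items.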

          • #6
            Thank you both so much! This works well.
