  • Documentation of all cases of deletion of observations

    Hello, all Statalists!
    I am running a long do-file, and I want to create a table (in Excel or Word) that documents every deletion of observations made along the way.
    Here is a subset of the dataset:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long id byte(var2 var1)
        1 10 5
        1 11 5
        1 11 5
        1 11 5
        4 11 5
        4 11 5
        4 11 5
        4 11 5
        8 11 5
        8 11 5
        8 11 5
        8 11 5
        9 12 5
        9 12 5
        9 12 5
        9 12 5
      101  9 5
      101 10 5
      101 10 5
    31766  9 8
    31766  9 8
    31766  9 8
    end

    In addition, these are two drop commands I use:
    Code:
    keep if var1==5
    drop if var2>12 | var2<10
    In this case, the table looks like this:
    Step  Command                     Observations Deleted  IDs Deleted  Observations  IDs
    0                                                                    22            6
    1     keep if var1==5             3                     1            19            5
    2     drop if var2>12 | var2<10   1                     0            18            5

    Again, the do-file I use contains many drop/keep commands, so I am looking for a solution that creates this long table automatically.

    Many Thanks!

  • #2
    I don't see an obvious way of doing this, other than keeping a copy of the dataset before each keep or drop command and comparing it with the dataset afterwards. This can be done through successive merges with the option -keep(using)-: with the post-command dataset as master and the earlier copy as using, the matched observations are the ones that survived, so -keep(using)- leaves only the observations that were deleted.
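A minimal sketch of this before/after comparison, using the auto dataset shipped with Stata. The row key obs_no, the locals n_before, n_after and n_deleted, and the example drop command are assumptions made for illustration; they are not from the original post. A unique row key is needed because the merge has to match individual observations.

```stata
* Sketch only: obs_no and the n_* locals are hypothetical names.
sysuse auto, clear
gen long obs_no = _n             // unique row key so that a 1:1 merge is possible

* Copy of the data before the command
tempfile before
save `before'
local n_before = _N

* The command being documented (illustrative)
drop if missing(rep78)
local n_after = _N

* Master = data after the command, using = copy saved before it.
* keep(using) retains only rows present in the earlier copy but no longer in memory,
* that is, the deleted observations.
preserve
merge 1:1 obs_no using `before', keep(using) nogenerate
local n_deleted = _N
display "observations deleted: `n_deleted'"
list obs_no make rep78, noobs
restore
```

The preserve/restore pair puts the post-command data back in memory once the deleted rows have been listed, so the do-file can carry on with the next command.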



    • #3
      Not exactly elegant, but you should be able to extract the info you need:

      Code:
      clear all
      sysuse auto
      gen id = _n
      expand rep78
      
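      capture program drop logthis
      program define logthis, rclass
          * Usage: logthis "keep/drop command"
          * Assumes the identifier variable is named id

          * Number of distinct ids before the command
          capture drop temp_tag
          egen temp_tag = tag(id)
          quietly summarize temp_tag, meanonly
          return scalar uid_0 = r(sum)

          * Run the command; keep and drop leave the number deleted in r(N_drop)
          `1'
          return local command = "`1'"
          return scalar n_drop = r(N_drop)

          * Number of distinct ids after the command
          capture drop temp_tag
          egen temp_tag = tag(id)
          quietly summarize temp_tag, meanonly
          return scalar uid_1 = r(sum)

          * Number of observations remaining
          quietly count
          return scalar n_new  = r(N)
      end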
      
      logthis "drop if mpg > 30 & mpg < ."
      return list
      
      logthis "keep if foreign == 1"
      return list
      Results:
      Code:
      . logthis "drop if mpg > 30 & mpg < ."
      (25 observations deleted)
      
      . return list
      
      scalars:
                    r(n_new) =  215
                    r(uid_1) =  69
                   r(n_drop) =  25
                    r(uid_0) =  74
      
      macros:
                  r(command) : "drop if mpg > 30 & mpg < ."
      
      .
      . logthis "keep if foreign == 1"
      (144 observations deleted)
      
      . return list
      
      scalars:
                    r(n_new) =  71
                    r(uid_1) =  18
                   r(n_drop) =  144
                    r(uid_0) =  69
      
      macros:
                  r(command) : "keep if foreign == 1"
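One way to go from per-command results like these to the table requested in #1 is to post one row per step with -postfile- and export the collected rows with -export excel-. The sketch below is self-contained and uses the example data from #1. The program name trackstep, the postfile handle steps, the variable names in the log, and the file name deletions.xlsx are all assumptions made for illustration. Step 0 runs no command and just records the starting counts.

```stata
* Sketch only: trackstep, steps, and deletions.xlsx are hypothetical names.
clear
input long id byte(var2 var1)
    1 10 5
    1 11 5
    1 11 5
    1 11 5
    4 11 5
    4 11 5
    4 11 5
    4 11 5
    8 11 5
    8 11 5
    8 11 5
    8 11 5
    9 12 5
    9 12 5
    9 12 5
    9 12 5
  101  9 5
  101 10 5
  101 10 5
31766  9 8
31766  9 8
31766  9 8
end

capture program drop trackstep
program define trackstep
    * Usage: trackstep step_number "keep/drop command"
    args step cmd
    tempvar first
    * Observations and distinct ids before the command
    local n_before = _N
    egen byte `first' = tag(id)
    quietly count if `first'
    local ids_before = r(N)
    drop `first'
    * Run the command itself (an empty string runs nothing)
    `cmd'
    * Distinct ids after the command
    egen byte `first' = tag(id)
    quietly count if `first'
    local ids_after = r(N)
    drop `first'
    * One row per step: deleted observations, ids lost entirely, remaining counts
    post steps (`step') (`"`cmd'"') (`n_before' - _N) ///
        (`ids_before' - `ids_after') (_N) (`ids_after')
end

capture postclose steps
tempfile steplog
* str200 truncates longer commands
postfile steps int step str200 command ///
    long(n_deleted ids_deleted n_obs n_ids) using `steplog'

trackstep 0 ""
trackstep 1 "keep if var1==5"
trackstep 2 "drop if var2>12 | var2<10"
postclose steps

* Show and export the log, then return to the cleaned data
preserve
use `steplog', clear
list, noobs abbreviate(12)
export excel using "deletions.xlsx", firstrow(variables) replace
restore
```

For the example data, the log holds the same counts as the table in #1, except that step 0 shows zero deletions rather than blanks. Here ids_deleted counts only ids that disappear entirely, which is why step 2 reports one deleted observation but no deleted id.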



      • #4
        For a similar problem, see trackobs (SSC).
