
  • Tracking dropped observation counts

    I am using Robert Picard's excellent project package. I would like to keep track of the number of observations dropped while cleaning and merging my data, along with the reasons.
    I know how to do the counts, but my question is how to keep track and generate a final tally.
    It can be written to the logs, but then I have to manually go and extract them from the logs.
    If I append them to a separate file (e.g. a text file), I run into a problem when one do-file that drops observations is edited and another is not: project does not re-run a do-file when nothing it depends on has changed, so the counts from the unedited do-file get lost.
    I was thinking along the lines of a .dta file with one variable per drop reason, so that each count can be updated individually, but this seems like a very "hacky" solution. Also, the project package doesn't seem to have a way to handle a file that is changed by more than one do-file.

    I would love to hear any suggestions.
    Thank you!

  • #2
    I typically use notes to record the number of observations dropped, along with the relevant explanations, in the modified dataset. As the do-file runs, you can keep a tally of how many observations have been dropped and add a final count in a note (it is good practice to double-check that the final tally matches the difference between the initial and final observation counts).
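
    A minimal sketch of the notes approach, assuming Stata's shipped auto dataset and two made-up drop rules (the rules and local macro names are illustrative, not taken from anyone's actual project):

    ```stata
    * Illustrative only: the dataset and both drop rules are assumptions
    sysuse auto, clear
    local n_start = _N

    * Count first, then drop, then record the count and reason in a note
    count if missing(rep78)
    local n_missing = r(N)
    drop if missing(rep78)
    note: dropped `n_missing' observations with missing rep78

    count if price > 10000
    local n_price = r(N)
    drop if price > 10000
    note: dropped `n_price' observations with price above 10000

    * Check that the tally matches the change in the observation count
    local n_total = `n_missing' + `n_price'
    assert `n_total' == `n_start' - _N
    note: dropped `n_total' of `n_start' observations in total

    * The notes travel with the dataset once it is saved
    notes list
    ```

    Because the notes are stored in the dataset itself, they are regenerated whenever the do-file that creates the dataset is re-run and stay untouched otherwise.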

    You are describing a situation where original data undergoes modifications via multiple do-files. I do that all the time but it would be quite illogical for a subsequent do-file to overwrite what was created by an earlier do-file. If you are trying to append annotations created by multiple do-files, save each annotation set separately (one per do-file) and then append them in a separate do-file to produce a final count.
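
    One way this could look, as a sketch only: each do-file writes its counts to its own small dataset, and a separate do-file appends them. The file names (drops_clean, drops_merge), variable names, and the drop rule are hypothetical. The project declarations are left out so the sketch runs on its own; in a project build, each counts file would presumably be declared as created by its do-file and used by the tally do-file.

    ```stata
    * --- in the cleaning do-file (hypothetical names throughout) ---
    sysuse auto, clear
    count if missing(rep78)
    local n_missing = r(N)
    drop if missing(rep78)

    * Write this do-file's counts to its own one-row dataset
    preserve
    clear
    set obs 1
    generate str20 dofile = "clean"
    generate str40 reason = "missing rep78"
    generate long n_dropped = `n_missing'
    save drops_clean, replace
    restore

    * --- in a separate tally do-file ---
    use drops_clean, clear
    * append using drops_merge   // assumed: one counts file per contributing do-file
    list dofile reason n_dropped, sepby(dofile)

    * Overall number of observations dropped across all listed files
    summarize n_dropped, meanonly
    display "total dropped: " r(sum)
    ```

    Since every counts file has exactly one do-file that writes it, no file is ever modified by more than one do-file, and a do-file that is not re-run simply leaves its existing counts file in place.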



    • #3
      trackobs (SSC) works along the same lines as notes. It does not store explanations, and I have no idea whether or how it might be combined with project.

      Best
      Daniel



      • #4
        Thank you both for those suggestions!
        I'm going to look into both of those options now.
