  • tagging nonunique variables if they were present in an old version of a dataset (i.e. to see what has changed)

    I have two versions of a dataset that I am trying to compare.
    The ID variable is split into Village, Concession, Menage, Woman, Child components

    To illustrate, the dataset looks like this:
    Village Concession Menage Woman Child
    01 09 01 01 01
    01 09 01 01 02
    01 09 02 02 01
    01 10 01 01 01
    01 10 01 01 02
    01 11 01 01 01
    02 09 01 01 01
    02 09 01 01 02
    02 13 01 01 01
    I want to see whether any new concessions have been added in the most recent version of the database (the Village and Concession parts of the ID variable uniquely identify a concession). I have no problem generating a dataset of just the unique Village and Concession combinations present in the old version.

    I took the old dataset, stripped out the menage, woman and child components and then removed all duplicates:
    Village Concession
    01 09
    01 10
    01 11
    02 09
    02 13
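
    For reference, a minimal sketch of that deduplication step. It assumes the ID components are stored as numeric variables (so the leading zeros shown above are dropped) and types the example rows in with -input- instead of loading the real file:

    ```stata
    * Assumption: ID parts are numeric; rows are the illustration above.
    clear
    input Village Concession Menage Woman Child
    1  9 1 1 1
    1  9 1 1 2
    1  9 2 2 1
    1 10 1 1 1
    1 10 1 1 2
    1 11 1 1 1
    2  9 1 1 1
    2  9 1 1 2
    2 13 1 1 1
    end

    * Keep only the concession-level identifiers and drop repeated rows.
    keep Village Concession
    duplicates drop

    * -isid- exits with an error if the pair does not uniquely identify rows.
    isid Village Concession
    list, noobs
    ```

    With the real data, the -input- block would be replaced by -use- on the old file, and the result saved for the later merge.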

    However, when I try an m:1 merge to compare the new dataset version (with all original observations) against the old dataset (stripped of menage, woman, and child to yield only the concessions present), I run into a problem: the Village/Concession combination is not unique in the new version, so Stata does not let me proceed. I'm sure there is a simple solution to this; I just don't know it! Any ideas on how to go about this?

    -Matias

  • #2
    What if you just dropped the duplicates of village/concession in both the new and old files, and then merged the new file to the old with the "new" as the using file? Any observation in the combined file that -merge- flags as "using only" (_merge == 2; see -help merge-) would be one of your observations of interest, as I understand it.
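
    A rough sketch of that idea, not tested on the real files. The "new" rows below are hypothetical: they repeat the old concessions and add one made-up concession (Village 2, Concession 14), and the IDs are assumed to be numeric:

    ```stata
    * Hypothetical new version, already reduced to Village and Concession.
    clear
    input Village Concession
    1  9
    1  9
    1 10
    1 11
    2  9
    2 13
    2 14
    2 14
    end
    duplicates drop
    tempfile new_conc
    save `new_conc'

    * Old version, reduced the same way.
    clear
    input Village Concession
    1  9
    1  9
    1 10
    1 11
    2  9
    2 13
    end
    duplicates drop

    * Both files are now unique on the key, so a 1:1 merge is allowed.
    merge 1:1 Village Concession using `new_conc'

    * _merge == 2 marks concessions that appear only in the new file.
    list Village Concession if _merge == 2, noobs
    ```

    Run as a do-file so the -tempfile- local stays in scope. Concessions that were dropped from the new version would show up as _merge == 1.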

    • #3
      Originally posted by Matias Iberico
      However, when I try an m:1 merge to compare the new dataset version (with all original observations) against the old dataset (stripped of menage, woman, and child to yield only the concessions present), I run into a problem: the Village/Concession combination is not unique in the new version, so Stata does not let me proceed. I'm sure there is a simple solution to this; I just don't know it! Any ideas on how to go about this?
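      Reverse the cardinality. With the deduplicated old file in memory as the master, Village and Concession are unique in the master and repeated in the using file, so the merge is 1:m, not m:1.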

      From
      Code:
      merge m:1 Village Concession using new
      to
      Code:
      merge 1:m Village Concession using new
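
      As an illustration only, here is a self-contained sketch of the 1:m merge followed by a tag for the new concessions. The rows for Village 2, Concession 14 are made up, the variable name new_concession is my own, and the IDs are assumed to be numeric:

      ```stata
      * Hypothetical new version: the original rows plus one added concession.
      clear
      input Village Concession Menage Woman Child
      1  9 1 1 1
      1  9 1 1 2
      1  9 2 2 1
      1 10 1 1 1
      1 10 1 1 2
      1 11 1 1 1
      2  9 1 1 1
      2  9 1 1 2
      2 13 1 1 1
      2 14 1 1 1
      2 14 1 1 2
      end
      tempfile new
      save `new'

      * Old version reduced to its unique concessions (the master).
      clear
      input Village Concession
      1  9
      1 10
      1 11
      2  9
      2 13
      end

      * Master is unique on the key, using is not: 1:m.
      merge 1:m Village Concession using `new'

      * Tag observations whose concession exists only in the new version.
      generate byte new_concession = (_merge == 2)
      list Village Concession Menage Woman Child if new_concession, noobs
      ```

      Run as a do-file so the -tempfile- local stays in scope. Every observation of the new file is kept, so the tag can be carried into later work.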

      • #4
        Years late, but thank you, Joseph Coveney! That did the trick.
        -Matias
