tagging nonunique variables if they were present in an old version of a dataset (i.e. to see what has changed)

Matias Iberico

Join Date: Jul 2016

Posts: 4
#1

08 Dec 2016, 05:20

I have two versions of a dataset that I am trying to compare.
The ID variable is split into Village, Concession, Menage, Woman, and Child components.

To illustrate, the dataset looks like this:
Village  Concession  Menage  Woman  Child
01       09          01      01     01
01       09          01      01     02
01       09          02      02     01
01       10          01      01     01
01       10          01      01     02
01       11          01      01     01
02       09          01      01     01
02       09          01      01     02
02       13          01      01     01

I want to see whether any new concessions have been added to the most recent version of the database (the Village and Concession parts of the ID variable uniquely identify a concession). I have no problem generating a dataset of just the unique Village and Concession combinations present in the old dataset version.

I took the old dataset, stripped out the Menage, Woman, and Child components, and then removed all duplicates:

Village  Concession
01       09
01       10
01       11
02       09
02       13
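
A minimal sketch of that stripping step, for reference. The file names `old` and `old_concessions` are hypothetical and not from the post; the variable names are the ones shown in the tables above.

```stata
* Hypothetical file names: old.dta is the earlier version of the dataset
use old, clear

* Keep only the two ID components that identify a concession
keep Village Concession

* Drop observations that are exact duplicates on all remaining variables
duplicates drop

* Stop with an error unless Village and Concession now uniquely identify rows
isid Village Concession

save old_concessions, replace
```

The `isid` line is only a safety check: it confirms that the reduced file can serve as the "1" side of a merge.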

However, when I try to do an m:1 merge to compare the new dataset version (with all original observations) with the old dataset (stripped of Menage, Woman, and Child to yield only the concessions present), I have a problem: the Village Concession combination is not unique in the new version, and Stata does not let me proceed. I'm sure there is a simple solution to this, I just don't know it! Any ideas as to how to go about this?

-Matias
Mike Lacy

Join Date: Apr 2014

Posts: 2449
#2

08 Dec 2016, 09:13

What if you just dropped the duplicates of village/concession in both the new and old files, and then merged the new file to the old, with the "new" file as the using file? Any observation in the combined file that -merge- identified as "appeared in using only" (see -help merge-) would be your observations of interest, to my understanding.
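
One way this suggestion might look in a do-file, as a sketch only. The file names `old` and `new` are assumptions, and the temporary file name `new_concessions` is made up for illustration.

```stata
* Hypothetical file names: old.dta and new.dta are the two dataset versions

* Reduce the new version to one row per Village-Concession pair
use new, clear
keep Village Concession
duplicates drop
tempfile new_concessions
save `new_concessions'

* Do the same for the old version, which becomes the master file
use old, clear
keep Village Concession
duplicates drop

* Both sides are now unique on the key, so a 1:1 merge is allowed
merge 1:1 Village Concession using `new_concessions'

* _merge == 2 marks pairs that appear in the using (new) file only
list Village Concession if _merge == 2
```

Because both files are reduced to unique pairs first, the result lists the added concessions but does not carry the Menage, Woman, and Child rows along.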
Joseph Coveney

Join Date: Apr 2014

Posts: 4540
#3

08 Dec 2016, 15:59

Originally posted by Matias Iberico

However when I try to do a m:1 merge to compare the new dataset version (with all original observations) with the old dataset (stripped of menage, woman, child to yield only the concessions present) I have a problem because the Village Concession variable is not unique in the new version and Stata does not let me proceed. I'm sure there is a simple solution to this, I just don't know it! Any ideas as to how to go about this?

Reverse the cardinality.

From

Code:

merge m:1 Village Concession using new

to

Code:

merge 1:m Village Concession using new
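
A sketch of the surrounding steps, assuming the de-duplicated old file is the master and was saved under the hypothetical name `old_concessions`, and that `new` is the full new version. The variable name `new_concession` is also made up for illustration.

```stata
* Master: one row per Village-Concession pair from the old version
use old_concessions, clear

* Using: the full new version, with many rows per Village-Concession pair
merge 1:m Village Concession using new

* _merge: 1 = old only, 2 = new only, 3 = present in both versions
tabulate _merge

* Tag rows whose concession did not exist in the old version
generate byte new_concession = (_merge == 2)

* Optional: drop concessions that exist only in the old version,
* since those rows carry no Menage, Woman, or Child values
drop if _merge == 1
```

The same comparison should also work the other way around: load `new` as the master and run `merge m:1 Village Concession using old_concessions`, in which case the added concessions are the rows with `_merge == 1`.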
Matias Iberico

Join Date: Jul 2016

Posts: 4
#4

30 Mar 2023, 18:37

Years late, but thank you! That did the trick, Joseph Coveney.
-Matias