Using -drop _merge- command

Jessica Choi

Join Date: Nov 2019

Posts: 15
#1

28 Nov 2019, 12:23

Hello,

I have merged two datasets (double entry) with the same variables. To distinguish them, I had originally added the prefixes a_ and b_ to each variable in dataset 1 and dataset 2, respectively.

I am now trying to run -drop _merge-, but Stata gives me an error message saying "variable _merge not found", r(111). What could be the problem here?

Thank you for your help.
Tags: None
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#2

28 Nov 2019, 12:54

Stata has quite helpfully indicated the problem: there is no variable named -_merge-. Scrutinize the -merge- command you used to see whether you specified the -nogen- option or the -generate()- option, or whether -_merge- was dropped sometime after merging.
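
To make the three cases concrete, here is a minimal sketch on toy data. The names id, entry_a, entry_b, and source are made up for illustration and are not from the original poster's files. Because it uses temporary files, it should be run as a whole from a do-file.

```stata
* Toy double-entry data; id, entry_a, entry_b, and source are made-up names
clear
input byte(id b_q11)
1 1
2 2
end
tempfile entry_b
save `entry_b'

clear
input byte(id a_q11)
1 1
2 2
end
tempfile entry_a
save `entry_a'

* Default: -merge- creates _merge, so -drop _merge- works
merge 1:1 id using `entry_b'
drop _merge

* nogenerate: _merge is never created, so -drop _merge- would fail
use `entry_a', clear
merge 1:1 id using `entry_b', nogenerate
capture confirm variable _merge
* a nonzero return code here means _merge is absent
display "return code from confirm: " _rc

* generate(): the indicator exists, but under the name you chose
use `entry_a', clear
merge 1:1 id using `entry_b', generate(source)
tabulate source
```

If the last two cases match what your own -merge- line does, there is simply nothing named -_merge- to drop.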
Comment
Jessica Choi

Join Date: Nov 2019

Posts: 15
#3

28 Nov 2019, 15:12

Hello Leonardo,

Thank you for your help. I somehow got -drop _merge- to work, but I still see the prefixes a_ and b_ in the dataset. Is there a way to remove these prefixes so that there is one value under each variable and no duplicates?
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#4

28 Nov 2019, 19:15

Hi Jessica,

The short answer is that I can imagine multiple ways to answer your question, but the specifics greatly depend on your data at hand. Please do read the FAQ (linked in the top left corner) and pay attention to #12 - how to ask a question and provide a data example using -dataex-. Then paste back example data with a clear description of what you want your final dataset to look like.
Comment

Jessica Choi

Join Date: Nov 2019

Posts: 15
#5

29 Nov 2019, 05:35

Hello Leonardo,

Thank you for pointing me to the FAQ. I am attaching example data below:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(a_q11 a_q12 a_q13 a_q14 b_q11 b_q12 b_q13 b_q14)
1 1 2 1 1 1 2 1
1 2 2 2 1 2 2 2
2 1 1 1 2 1 1 1
3 3 3 3 3 3 3 3
4 5 3 3 4 5 3 3
3 3 2 2 3 3 2 2
1 2 1 1 1 2 1 1
5 5 6 5 5 5 6 5
1 1 2 2 1 1 2 2
3 3 4 3 3 3 4 3
4 3 3 . 4 3 3 .
4 3 2 3 4 3 2 3
2 2 2 3 2 2 2 3
3 1 2 3 3 1 2 3
2 2 1 2 2 2 1 2
4 3 3 3 4 3 3 3
3 3 3 4 3 3 3 4
3 3 5 3 3 3 5 3
4 4 4 5 4 4 4 5
4 3 3 3 4 3 3 3
4 5 6 1 4 5 6 1
3 2 1 2 3 2 1 2
2 1 1 2 2 1 1 2
5 4 4 3 5 4 4 3
3 2 3 3 3 2 3 3
3 2 2 2 3 2 2 2
3 3 2 2 3 3 2 2
3 3 2 3 3 3 2 3
3 2 1 2 3 2 1 2
3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3
. . . . . . . .
3 5 3 4 3 5 3 4
end

I am hoping to remove the prefixes a_ and b_ so that the final dataset has variables named q11 q12 q13 q14, and so forth, without showing the same data twice.

Thank you for your time and help!

Comment

Rich Goldstein

Join Date: Mar 2014

Posts: 4466
#6

29 Nov 2019, 06:47

I'm not completely sure I understand, but it looks like you can just "drop a_*" (or b_*) and rename the others following "rule" 6 in the help file, e.g.:

Code:

ren b_* *
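
Put together, the two steps might look like the sketch below. It uses a cut-down toy version of the posted example (two questions, three observations) and assumes the two entries have already been checked against each other, so that discarding the a_ copy loses nothing.

```stata
* Toy subset of the posted data; assumes a_ and b_ entries already agree
clear
input byte(a_q11 a_q12 b_q11 b_q12)
1 1 1 1
1 2 1 2
2 1 2 1
end

* Discard one copy of each question
drop a_*

* Strip the b_ prefix from every remaining b_ variable
rename b_* *

* The dataset now holds q11 and q12 only
describe
```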
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2403
#7

29 Nov 2019, 07:26

Rich has provided good advice on wildcard use with -rename-.

It seems like you are doing or have done double-entry data validation. Which is it? If the data have already been validated and corrected, then you can simply follow Rich's advice to drop one set of prefixed variables (say a_*) and rename the other.

If you still need to check which data have been entered correctly, that is a different matter. You could begin to explore this by, say, listing variables that don't match, one at a time.

Code:

* only list observations if q11 doesn't match between persons A and B.
list if a_q11 != b_q11
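
With many questions, the same check can be looped over the variable stubs rather than typed one at a time. This is only a sketch on toy data: the mismatch in the second observation is deliberately invented, and the stub list q11 q12 would need to be extended to cover your own variables.

```stata
* Toy data; the mismatch in the second observation is deliberately made up
clear
input byte(a_q11 a_q12 b_q11 b_q12)
1 1 1 1
1 2 1 3
2 1 2 1
end

* For each question, report how many observations disagree and list them
foreach v in q11 q12 {
    quietly count if a_`v' != b_`v'
    display "`v': " r(N) " mismatch(es)"
    list a_`v' b_`v' if a_`v' != b_`v'
}
```

Note that Stata treats two missing values as equal, so observations where both entries are missing are not flagged by this comparison.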
Comment
Jessica Choi

Join Date: Nov 2019

Posts: 15
#8

29 Nov 2019, 15:18

Hello Rich and Leonardo,

Thank you so much for your help! The code that Rich provided worked perfectly! I also learned a new way to compare the data for each variable from Leonardo!

Happy belated Thanksgiving to you both.
Comment
