Dear Stata experts,
I have been working on a Stata problem for the last two days and I just cannot find out what I am doing wrong. Maybe you have the solution to my problem:
Overall I am trying to merge four data sets into one.
The first set of data is called "Kogan" and contains "permno" and "fyear" and some other data that interests us.
The second set of data is called "LPERMNO & Standard and Poors" and contains "permno" and "gvkey" data. I merged it with the first set in order to get the Kogan data with gvkey as an identifier. This worked, and we created a new dataset with "gvkey" and "fyear".
The third set of data is called "Variablen" and the last one "MA ability". This is where the problem starts: we cannot merge these two datasets into the merged dataset from the previous step. I already dropped duplicates, renamed "_merge", and tried some other things, as you can see in the code. However, I always get an error message at step 2.3 (the code before 2.3 runs fine). The error message says "Variables GVKEY fyear do not uniquely identify observations in the master data", which is puzzling, since both datasets that I want to merge definitely contain "GVKEY" and "fyear".
If you have any idea how to solve this problem, PLEASE, PLEASE tell me. Thank you so much in advance!!
*1) Prepare datasets for merge -------------------------------------------
*Rename Standard and Poor's identifier to gvkey form
*1.1) Prepare datasets for merge (Kogan)
use "Kogan Patent data.dta", clear
duplicates tag permno fyear, gen(dup)
drop if dup==1
save "Kogan_v05.dta", replace
*0 observations deleted
*1.2) Prepare datasets for merge (LPERMNO & Standard and Poors)
use "LPERMNO & Standard and Poors Merging Dataset.dta", clear
duplicates tag LPERMNO fyear, gen(dup)
drop if dup==1
*Rename variable LPERMNO to permno
rename LPERMNO permno
save "LPERMNO_v05.dta", replace
*14 observations dropped
*1.3) Prepare datasets for merge (Variablen)
use "Variablen.dta", clear
duplicates tag GVKEY fyear, gen(dup)
drop if dup==1
save "Variablen_v05.dta", replace
*0 observations dropped
*1.4) Prepare datasets for merge (MA ability)
use "ma_score_edited.dta", clear
duplicates tag gvkey fyear, gen(dup)
drop if dup==1
save "ma_score_v05.dta", replace
*400 observations dropped
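A note on the preparation steps above, offered as a sketch rather than a confirmed diagnosis: the variable created by duplicates tag counts how many other observations share the same key, so it is 0 for unique keys, 1 for members of a pair, 2 for members of a triple, and so on. A rule of the form drop if dup==1 therefore removes only pairs. The toy data below (hypothetical values, lowercase gvkey) illustrates this and shows two alternatives.

```stata
* Toy data (hypothetical values): one unique key, one pair, one triple
clear
input long gvkey int fyear
1001 2000
1002 2000
1002 2000
1003 2000
1003 2000
1003 2000
end

* dup = number of OTHER observations with the same gvkey-fyear
duplicates tag gvkey fyear, gen(dup)
tabulate dup

* Rule used in the preparation steps: removes only the pair,
* the triple (dup==2) survives, so isid still fails
preserve
drop if dup==1
capture isid gvkey fyear
display "isid return code after drop if dup==1: " _rc
restore

* Alternative A: drop every observation whose key is not unique
preserve
drop if dup>0
isid gvkey fyear
restore

* Alternative B: keep the first observation of each key
drop dup
duplicates drop gvkey fyear, force
isid gvkey fyear
```

Which alternative is appropriate depends on whether the duplicated rows carry conflicting information; Alternative B keeps an arbitrary copy, so it is only safe when the copies are interchangeable.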
*2) Merge Datasets--------------------------------------------------------------
*2.1) Merge Kogan Dataset with LPERMNO & Standard and Poors
*2.1.1) Open prepared Kogan dataset
use "Kogan_v05.dta", clear
*2.1.2) Merge with LPERMNO & Standard and Poors datasets
merge 1:1 permno fyear using "LPERMNO_v05.dta"
rename _merge execmatch01
save "LPERMNO&Kogan_v05.dta", replace
*2.2) Open merged Kogan/LPERMNO dataset and drop duplicates on GVKEY fyear
use "LPERMNO&Kogan_v05.dta", clear
duplicates tag GVKEY fyear, gen(dupl)
drop if dupl==1
save "LPERMNO&Kogan_v05a.dta", replace
*4922 observations dropped
*2.3) Merge with prepared Variablen Dataset
use "LPERMNO&Kogan_v05a.dta", clear
merge 1:1 GVKEY fyear using "Variablen_v05.dta"
rename _merge execmatch02
rename GVKEY gvkey
*Error message from the merge in 2.3: Variables GVKEY fyear do not uniquely identify observations in the master data
*2.4) Merge with MA ability dataset
merge 1:1 gvkey fyear using "ma_score_v05.dta", keep(match)
rename _merge execmatch03
save "final_Merged_Data_v04.dta", replace
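One thing that may be worth checking before step 2.3, stated as an assumption since I cannot see it from the output: after the merge in 2.1, rows that exist only in the Kogan file have no GVKEY, so many rows within the same fyear share the key "missing GVKEY, fyear". Those groups are usually much larger than two, so a dup==1 rule would leave them in place. The sketch below uses a toy stand-in for the merged file (hypothetical values, lowercase gvkey; the real variable is GVKEY at that stage) to show the checks.

```stata
* Toy stand-in for the merged file (hypothetical values).
* Missing gvkey mimics master-only rows from the permno merge.
clear
input long permno long gvkey int fyear
10001 1001 2000
10002 1002 2000
10003 .    2000
10004 .    2000
10005 .    2000
end

* How many gvkey-fyear combinations occur more than once?
duplicates report gvkey fyear

* How many rows have no gvkey at all?
count if missing(gvkey)

* One way forward, ASSUMING rows without gvkey are not needed downstream
drop if missing(gvkey)

* isid exits with an error if the key is still not unique
isid gvkey fyear
```

On the real data, the same three commands can be run right after use "LPERMNO&Kogan_v05a.dta", clear with GVKEY in place of gvkey. If isid passes there, merge 1:1 should no longer complain about the master data.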