Hello All,
As the title says, I am having trouble merging the two datasets correctly. My problem is similar to the one “Shaquille” posted a few days ago. Mine is a bit different, since I am trying to merge CRSP/Compustat Merged (CCM) monthly stock returns (not daily, as in the recent post) with Compustat annual fundamental data. However, after studying that post I am still stuck, and I am hoping that you can give me some advice.
First of all: I did manage to merge the two datasets on the identifiers GVKEY and DATADATE. Afterwards I dropped all duplicates that were in the merged dataset.
But: I can't tell whether I did it correctly. The code is posted below. Any advice on that? It ran, but I don't know if it makes sense. Or have I produced nonsense?
The second problem is matching each annual fundamental observation with the correct return. The returns are monthly, but I want them to be annual, so that they match the fundamental data. Do you have any advice on that?
Here is some information about the datasets.
If you need any further information on the problem, just let me know!
Thank you!
The merged dataset looks like the following:
Acronyms:
fyear: data date fiscal year
prcc_c: Price Close - Annual - Calendar
prcc_f: Price Close - Annual - Fiscal
prccm: Price Close - Monthly
trt1m: Monthly Total Return
csho: Common Shares Outstanding
gvkey   datadate    fyear   company name   cusip   prcc_f   prcc_c   trt1m   csho
xxx     31may1972   1971    AAR Corp.      xxx     22       32       -5.7    1000
Stata code to merge the datasets and to eliminate duplicates:
* Sort the monthly CCM file on the merge keys and save it
use "CCM_monthlyse_0822_LPERMNO.dta"
sort gvkey datadate
save, replace
clear
* Merge the annual Compustat file (CT) with the monthly CCM file
use "CT_0822_GVKEY.dta"
sort gvkey datadate
merge gvkey datadate using "CCM_monthlyse_0822_LPERMNO.dta"
tab _merge
* Keep only observations found in both files (_merge==3)
drop if _merge==1
drop if _merge==2
destring gvkey, replace
xtset gvkey datadate
sort gvkey datadate
* Delete duplicates
desc
summ datadate, format
* dup = number of additional observations with the same gvkey and datadate
duplicates tag gvkey datadate, gen(dup)
tab dup
preserve
drop if dup==1
restore
order gvkey datadate dup
preserve
* Flag every firm that has at least one observation with dup==1
gen keeper=0
levelsof gvkey if dup==1, local(levels)
foreach x of local levels {
    replace keeper=1 if `x' == gvkey
}
tab dup
* Keep only non-duplicated observations
keep if dup==0
tab dup
xtset gvkey datadate
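To make the second problem concrete, here is a minimal, untested sketch of one way the monthly returns might be turned into an annual figure before merging: compounding trt1m over the 12 months ending in each month. It is an illustration under assumptions, not a verified solution. It assumes trt1m is in percent (as the -5.7 in the sample row suggests), that datadate is a Stata daily date, and that the monthly file has exactly one observation per gvkey and month. The names mdate, firm, lnret, sumln and ret12m are hypothetical.

```stata
* Sketch only, not tested on the actual files.
use "CCM_monthlyse_0822_LPERMNO.dta", clear
* Monthly date built from the daily datadate
gen mdate = mofd(datadate)
format mdate %tm
* Numeric panel id, so gvkey itself is left unchanged for the later merge
egen long firm = group(gvkey)
* xtset stops with an error if a firm-month appears more than once
xtset firm mdate
* Log gross monthly return (trt1m assumed to be in percent)
gen double lnret = ln(1 + trt1m/100)
* Sum the current month and the 11 preceding months
gen double sumln = lnret
forvalues k = 1/11 {
    replace sumln = sumln + L`k'.lnret
}
* Trailing 12-month compounded return, in percent
gen double ret12m = 100*(exp(sumln) - 1)
drop lnret sumln firm
```

With this approach ret12m is missing whenever any of the 12 months is missing, so firm-years with an incomplete return history would drop out of an annual return series. The file could then be merged with the annual data on gvkey and datadate as above, so that each fiscal year end carries the return over the preceding 12 months.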