Dear Colleagues,
I have encoutered a problem when I tried to merge individual stock return data with market return data. Let me illustrate the problem using the following example:
dataset_A: individual stock return data
assume both Jan1 and Jan2 are trading days, but stock 001 does not trade on jan2 therefore its ret data is missing, therefore, the firm-date observation for stock 001 on jan2 is not reported. Similarly, stock 003 also has a missing observation on jan1.
dataset_B: market stock return data
*
I want to get a merged dataset like:
it means that I want to keep all firm-date observations as long as the trddt is available in dataset_B for each stockid.
in stata, I tried to use the following merge command:
use dataset_B
sort trddt
use dataset A
sort trddt
merge m:1 trddt using "dataset_B", keep (match using)
however, I can only get a merged dataset as:
it cannot show up the missing observations for stock 001 and 003. I can figure out the problem in my code, but I really don't know how to correct my code to get the desired results.
Could anyone help me to on this issue?
I really appreciate your kind help and suggstions. Thank you so much!
I have encoutered a problem when I tried to merge individual stock return data with market return data. Let me illustrate the problem using the following example:
dataset_A: individual stock return data
stockid | trddt | ret |
001 | 2019-01-01 | 0.1 |
002 | 2019-01-01 | 0.05 |
002 | 2019-01-02 | 0.06 |
003 | 2019-01-02 | 0.07 |
dataset_B: market stock return data
trddt | market return |
2019-01-01 | 0.03 |
2019-01-02 | 0.04 |
I want to get a merged dataset like:
stockid | trddt | ret | market return |
001 | 2019-01-01 | 0.1 | 0.03 |
001 | 2019-01-02 | . | 0.04 |
002 | 2019-01-01 | 0.05 | 0.03 |
002 | 2019-01-02 | 0.06 | 0.04 |
003 | 2019-01-01 | . | 0.03 |
003 | 2019-01-02 | 0.07 | 0.04 |
in stata, I tried to use the following merge command:
use dataset_B
sort trddt
use dataset A
sort trddt
merge m:1 trddt using "dataset_B", keep (match using)
however, I can only get a merged dataset as:
stockid | trddt | ret | market return |
001 | 2019-01-01 | 0.1 | 0.03 |
002 | 2019-01-01 | 0.05 | 0.03 |
002 | 2019-01-02 | 0.06 | 0.04 |
003 | 2019-01-02 | 0.07 | 0.04 |
Could anyone help me to on this issue?
I really appreciate your kind help and suggstions. Thank you so much!
Comment