a m:1 merge problem

Paul Pang

Join Date: Jan 2020

Posts: 2
#1

a m:1 merge problem

16 Jan 2020, 09:49

Dear Colleagues,

I have encoutered a problem when I tried to merge individual stock return data with market return data. Let me illustrate the problem using the following example:

dataset_A: individual stock return data
stockid trddt ret

001 2019-01-01 0.1

002 2019-01-01 0.05

002 2019-01-02 0.06

003 2019-01-02 0.07

assume both Jan1 and Jan2 are trading days, but stock 001 does not trade on jan2 therefore its ret data is missing, therefore, the firm-date observation for stock 001 on jan2 is not reported. Similarly, stock 003 also has a missing observation on jan1.

dataset_B: market stock return data
trddt market return

2019-01-01 0.03

2019-01-02 0.04

*

I want to get a merged dataset like:
stockid trddt ret market return

001 2019-01-01 0.1 0.03

001 2019-01-02 . 0.04

002 2019-01-01 0.05 0.03

002 2019-01-02 0.06 0.04

003 2019-01-01 . 0.03

003 2019-01-02 0.07 0.04

it means that I want to keep all firm-date observations as long as the trddt is available in dataset_B for each stockid.

in stata, I tried to use the following merge command:

use dataset_B
sort trddt

use dataset A
sort trddt

merge m:1 trddt using "dataset_B", keep (match using)

however, I can only get a merged dataset as:
stockid trddt ret market return

001 2019-01-01 0.1 0.03

002 2019-01-01 0.05 0.03

002 2019-01-02 0.06 0.04

003 2019-01-02 0.07 0.04

it cannot show up the missing observations for stock 001 and 003. I can figure out the problem in my code, but I really don't know how to correct my code to get the desired results.

Could anyone help me to on this issue?

I really appreciate your kind help and suggstions. Thank you so much!

Last edited by Paul Pang; 16 Jan 2020, 09:51.
Tags: None
Paul Pang

Join Date: Jan 2020

Posts: 2
#2

16 Jan 2020, 10:42

I may figure out a possible way to solve this issue:

first I use -fillin- to create those missing observations in dataset_A：

fillin stockid traddt

then perform the m:1 merge

this method works but it also generates quite a lot of redundant missing observatoins, for example, if firm A is listed on 2000/1/1 and firm B is listed on 2018/1/1, then it will generates missing observations for firm B from 2000/1/1 to 2017/12/31.

therefore, any more efficient method is still appreciated!

Thank you.

Last edited by Paul Pang; 16 Jan 2020, 10:53.
Comment

stockid	trddt	ret
001	2019-01-01	0.1
002	2019-01-01	0.05
002	2019-01-02	0.06
003	2019-01-02	0.07

trddt	market return
2019-01-01	0.03
2019-01-02	0.04

stockid	trddt	ret	market return
001	2019-01-01	0.1	0.03
001	2019-01-02	.	0.04
002	2019-01-01	0.05	0.03
002	2019-01-02	0.06	0.04
003	2019-01-01	.	0.03
003	2019-01-02	0.07	0.04

stockid	trddt	ret	market return
001	2019-01-01	0.1	0.03
002	2019-01-01	0.05	0.03
002	2019-01-02	0.06	0.04
003	2019-01-02	0.07	0.04

Announcement

a m:1 merge problem

Comment