Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • a m:1 merge problem

    Dear Colleagues,

    I have encoutered a problem when I tried to merge individual stock return data with market return data. Let me illustrate the problem using the following example:

    dataset_A: individual stock return data
    stockid trddt ret
    001 2019-01-01 0.1
    002 2019-01-01 0.05
    002 2019-01-02 0.06
    003 2019-01-02 0.07
    assume both Jan1 and Jan2 are trading days, but stock 001 does not trade on jan2 therefore its ret data is missing, therefore, the firm-date observation for stock 001 on jan2 is not reported. Similarly, stock 003 also has a missing observation on jan1.

    dataset_B: market stock return data
    trddt market return
    2019-01-01 0.03
    2019-01-02 0.04
    *

    I want to get a merged dataset like:
    stockid trddt ret market return
    001 2019-01-01 0.1 0.03
    001 2019-01-02 . 0.04
    002 2019-01-01 0.05 0.03
    002 2019-01-02 0.06 0.04
    003 2019-01-01 . 0.03
    003 2019-01-02 0.07 0.04
    it means that I want to keep all firm-date observations as long as the trddt is available in dataset_B for each stockid.


    in stata, I tried to use the following merge command:

    use dataset_B
    sort trddt

    use dataset A
    sort trddt

    merge m:1 trddt using "dataset_B", keep (match using)

    however, I can only get a merged dataset as:
    stockid trddt ret market return
    001 2019-01-01 0.1 0.03
    002 2019-01-01 0.05 0.03
    002 2019-01-02 0.06 0.04
    003 2019-01-02 0.07 0.04
    it cannot show up the missing observations for stock 001 and 003. I can figure out the problem in my code, but I really don't know how to correct my code to get the desired results.

    Could anyone help me to on this issue?

    I really appreciate your kind help and suggstions. Thank you so much!
    Last edited by Paul Pang; 16 Jan 2020, 09:51.

  • #2
    I may figure out a possible way to solve this issue:

    first I use -fillin- to create those missing observations in dataset_A:

    fillin stockid traddt

    then perform the m:1 merge

    this method works but it also generates quite a lot of redundant missing observatoins, for example, if firm A is listed on 2000/1/1 and firm B is listed on 2018/1/1, then it will generates missing observations for firm B from 2000/1/1 to 2017/12/31.

    therefore, any more efficient method is still appreciated!

    Thank you.
    Last edited by Paul Pang; 16 Jan 2020, 10:53.

    Comment

    Working...
    X