I have a difficult merging task and would appreciate your help.
I have two datasets:
1. Mutual fund holdings, quarterly data (master dataset)
Three variables together uniquely identify each observation: wficn, cusip, and yrqt.
wficn: Wharton Financial Institution Center Number (WFICN), which is a unique and permanent fund portfolio identifier.
cusip: stock identifier; each holding is identified by its CUSIP.
yrqt: quarterly date variable, such as 1998q1, 1998q2, etc.
In each quarter there are many funds, each of which has many stock holdings.
2. CRSP monthly stock data, an unbalanced panel (using dataset)
cusip: stock identifier
yrm: monthly date variable, such as 1998m1, 1998m2, ..., 1998m12, etc.
TASK: merge these two datasets
What I really want (the ideal merge result) is:
Assume the equity holdings of fund i do not change between the 1998q1 and 1998q2 reports; that is, assume fund i holds the same set of equities in April and May 1998 as it does in June 1998. What can I do to the first dataset to achieve this?
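To make the idea concrete, here is a minimal sketch of that expansion in plain Python (used only for illustration; in Stata the same logic would presumably rely on `expand`). It assumes each quarterly holding row is copied to the three months of its own quarter and then matched to the monthly data on cusip and yrm. The sample rows and the `ret` variable are made up; only wficn, cusip, yrqt, and yrm come from my data.

```python
# Hypothetical sample rows; `ret` is an assumed CRSP variable, not from the post.
holdings = [
    {"wficn": 100001, "cusip": "AAA", "yrqt": "1998q2"},
    {"wficn": 100001, "cusip": "BBB", "yrqt": "1998q2"},
]
crsp_monthly = [
    {"cusip": "AAA", "yrm": "1998m4", "ret": 0.01},
    {"cusip": "AAA", "yrm": "1998m5", "ret": 0.02},
    {"cusip": "AAA", "yrm": "1998m6", "ret": 0.03},
    {"cusip": "BBB", "yrm": "1998m6", "ret": -0.01},  # unbalanced: m4, m5 missing
]


def months_in_quarter(yrqt):
    """Return the three monthly labels covered by a label like '1998q2'."""
    year, quarter = yrqt.split("q")
    first_month = 3 * (int(quarter) - 1) + 1
    return [f"{year}m{first_month + k}" for k in range(3)]


def expand_to_monthly(rows):
    """Copy each quarterly holding into one row per month of its quarter."""
    return [
        {**row, "yrm": yrm}
        for row in rows
        for yrm in months_in_quarter(row["yrqt"])
    ]


# Many fund-level rows can match one CRSP row per (cusip, yrm).
crsp_by_key = {(r["cusip"], r["yrm"]): r for r in crsp_monthly}

monthly_holdings = expand_to_monthly(holdings)
merged = [
    {**h, **crsp_by_key.get((h["cusip"], h["yrm"]), {})}
    for h in monthly_holdings
]
```

Rows with no CRSP match (here BBB in April and May) simply carry no `ret` value, which is what I would expect from an unbalanced panel.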
My current compromise is to convert the second dataset to quarterly frequency
and then do an m:1 merge using cusip and yrqt as the key.
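For reference, the compromise can be sketched as follows, again in plain Python for illustration only. The sketch assumes the quarterly value is taken from the last available month of each quarter; that rule, the sample rows, and the `ret` variable are assumptions made for this example.

```python
# Hypothetical sample rows; `ret` is an assumed CRSP variable.
holdings = [
    {"wficn": 100001, "cusip": "AAA", "yrqt": "1998q2"},
    {"wficn": 100002, "cusip": "AAA", "yrqt": "1998q2"},
]
crsp_monthly = [
    {"cusip": "AAA", "yrm": "1998m4", "ret": 0.01},
    {"cusip": "AAA", "yrm": "1998m5", "ret": 0.02},  # 1998m6 missing
]


def quarter_of(yrm):
    """Map a monthly label like '1998m5' to its quarter label '1998q2'."""
    year, month = yrm.split("m")
    return f"{year}q{(int(month) - 1) // 3 + 1}"


def collapse_to_quarterly(rows):
    """Keep, per (cusip, quarter), the row from the latest available month."""
    latest = {}
    for row in rows:
        key = (row["cusip"], quarter_of(row["yrm"]))
        month = int(row["yrm"].split("m")[1])
        if key not in latest or month > latest[key][0]:
            latest[key] = (month, row)
    return {
        key: {"cusip": key[0], "yrqt": key[1], "ret": row["ret"]}
        for key, (_, row) in latest.items()
    }


crsp_quarterly = collapse_to_quarterly(crsp_monthly)

# m:1 merge: many fund rows map to one CRSP row per (cusip, yrqt).
merged = [
    {**h, **crsp_quarterly.get((h["cusip"], h["yrqt"]), {})}
    for h in holdings
]
```

The drawback is that the monthly variation in the CRSP data is lost, which is why I would prefer the monthly expansion of the holdings.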