Dear all,
Some background on the data:
I have data from the IBES Unadjusted History file, which reports earnings per share. Earnings per share in IBES has been used as a measure of "street" earnings because it corresponds closely to the figure in the company's earnings announcement (Doyle et al. 2003).
My data contain the following variables:
cusip (firm identifier), eps (earnings per share, as a measure of street earnings), fpedats (fiscal year-end to which the estimate corresponds), anndats (announcement date in the press release)
The data entries look like this:
cusip eps fpedats anndats
11111111 0.2 31dec1988 07sep1987
11111111 1.2 31dec1988 02jan1988
11111111 0.3 31dec1988 05mar1988
11111111 1.1 31dec1988 09may1988
11111111 0.23 31dec1988 02feb1989
11111111 0.31 31dec1988 03feb1989
11111111 1.8 31dec1988 11nov1988
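In case it helps anyone reproduce the problem, here is a minimal sketch that recreates the listing above. It assumes fpedats and anndats should be numeric Stata daily dates; the helper names fpe_str and ann_str are only for illustration and are not in my actual file.

```stata
clear
input str8 cusip double eps str9 fpe_str str9 ann_str
"11111111" 0.2  "31dec1988" "07sep1987"
"11111111" 1.2  "31dec1988" "02jan1988"
"11111111" 0.3  "31dec1988" "05mar1988"
"11111111" 1.1  "31dec1988" "09may1988"
"11111111" 0.23 "31dec1988" "02feb1989"
"11111111" 0.31 "31dec1988" "03feb1989"
"11111111" 1.8  "31dec1988" "11nov1988"
end

* Convert the string dates to numeric daily dates (assumed storage type)
gen long fpedats = date(fpe_str, "DMY")
gen long anndats = date(ann_str, "DMY")
format fpedats anndats %td
drop fpe_str ann_str

list, sepby(cusip fpedats)
```

If the date variables were strings instead, sorting on them would be alphabetical rather than chronological, so the conversion step matters for everything that follows.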
As the example shows, a firm with a fiscal year-end of 31 December 1988 can still have earnings estimates announced after 31 December 1988.
The most recent eps for the fiscal year ending 31 December 1988 is dated 03feb1989 (perhaps before the firm files its annual report for 1988, which can sometimes be three months after the year-end).
The problem:
I want to keep only the latest eps value for each firm and fiscal year end.
I have done the following:
use street-ibes.dta,clear
compress
drop if trim(cusip)=="" // to drop missing cusip
drop if cusip=="00000000"
bysort cusip fpedats : gen seq=_n // This creates the first problem: seq is generated, but anndats is no longer sorted within each firm and fiscal year. For example:
cusip eps fpedats anndats seq
11111111 0.2 31dec1988 03feb1989 1
11111111 1.2 31dec1988 11nov1988 2
11111111 0.3 31dec1988 05mar1988 3
11111111 1.1 31dec1988 07sep1987 4
11111111 0.23 31dec1988 02feb1989 5
11111111 0.31 31dec1988 02jan1988 6
11111111 1.8 31dec1988 09may1988 7
*I tried to solve this issue by:
drop seq
sort cusip anndats // this gives the correct sort
bysort cusip anndats: gen seq=_n // this of course gives the wrong sequence, because the aim is to create the sequence within the fiscal year!
***My understanding is that, if the previous issue is solved, I should proceed as follows:
bysort cusip fpedats: gen maximum=_N
keep if seq==maximum
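To make the goal concrete, here is a rough sketch of the steps I have in mind, run on a subset of the rows from my example. It assumes anndats is a numeric daily date, and it puts anndats in parentheses after the by-variables so that Stata sorts on it within each cusip and fpedats group without using it to define the groups. The names fpe_str and ann_str are illustrative only. I am not sure this is the best way, so corrections are welcome.

```stata
clear
input str8 cusip double eps str9 fpe_str str9 ann_str
"11111111" 0.2  "31dec1988" "07sep1987"
"11111111" 0.23 "31dec1988" "02feb1989"
"11111111" 0.31 "31dec1988" "03feb1989"
"11111111" 1.8  "31dec1988" "11nov1988"
end

* Convert the string dates to numeric daily dates (assumed storage type)
gen long fpedats = date(fpe_str, "DMY")
gen long anndats = date(ann_str, "DMY")
format fpedats anndats %td
drop fpe_str ann_str

* Groups are cusip x fpedats; within each group, sort by anndats
bysort cusip fpedats (anndats): gen seq = _n

* Keep the last observation in each group, i.e. the latest anndats
by cusip fpedats: keep if seq == _N

list
```

On this subset, only the row announced on 03feb1989 should remain. If two rows in the same group share the same anndats, their relative order is arbitrary, so a tie-breaking variable would be needed in the parentheses as well.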
Can anyone help with this issue, please?
I have attached my data set so you can see the problem clearly, though I believe I have explained it in detail here.
Thanks in advance