Hi Statalisters,
I'm relatively new to Stata.
I need your help with some coding in Stata. See the -dataex- example below.
I have daily data on the closing stock price of all firms in the CRSP database from 2006 to 2017.
Right now, my data are sorted by multiple, shrcd, permco and bdate.
With the -duplicates tag- command shown below, I created the variable multiple, which flags firms that have duplicate observations on a given day, i.e. firms with multiple share classes trading.
I want to keep only one daily observation per firm (permco), namely the one for its primary share class, so that I can declare the data as panel data.
Code:
duplicates tag permco bdate, generate(multiple)
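A minimal sketch of how the tag can be inspected, assuming the command above has already been run. It relies on the documented behaviour of -duplicates tag-: the generated variable holds the number of other observations with the same permco and bdate, so it is 0 for a unique firm-day.

```stata
* multiple counts the other observations sharing the same permco-bdate:
* 0 = the firm-day is unique, 2 = three share classes traded that day
tabulate multiple

* number of observations that belong to a firm-day with several share classes
count if multiple > 0
```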
Thus, I want to write code that drops all firm-day observations belonging to a non-primary share class, for every firm that has multiple share classes outstanding.
The challenges here are:
I can't use shrcd (=share code) because it is mostly the same (i.e. 11) in my data.
I can't use shrcls (=share class) because a quick inspection of my data reveals that firms do not consistently label their primary share class as class 'A'. Plus, shrcls often has missing values.
I can't use shrout (=shares outstanding) because, although the number of shares outstanding sometimes makes it obvious which share class is the primary one, this is not always the case.
I can't use prc (=closing price) because, although CRSP indicates the secondary shares by giving them negative values, this is not always the case.
Does anyone know how I could tackle this?
To elaborate further: in the first three rows of the -dataex- example you see the same firm (permco: 54311) on a specific date (bdate: 2154) with shrcls A, B and C.
prc is roughly the same across the three classes, and both class A and class C have a sizeable number of shares outstanding, so I can't tell which one is the primary share class.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(permco permno) float(bdate prc) long shrout byte shrcd str1 shrcls int multiple
54311 14811 2154   47.27 228666 11 "C" 2
54311 13758 2154   49.99   9874 11 "B" 2
54311 13757 2154   48.94 104459 11 "A" 2
54311 14811 2155   46.49 228666 11 "C" 2
54311 13758 2155   49.36   9874 11 "B" 2
54311 13757 2155   48.26 104459 11 "A" 2
54311 13758 2156 -49.075   9874 11 "B" 2
54311 14811 2156    46.4 228666 11 "C" 2
54311 13757 2156   47.98 104459 11 "A" 2
54311 13757 2157   48.21 104459 11 "A" 2
54311 14811 2157    47.1 228666 11 "C" 2
54311 13758 2157 -49.365   9874 11 "B" 2
54311 13758 2158      49   9874 11 "B" 2
54311 14811 2158   47.82 228666 11 "C" 2
54311 13757 2158   48.39 104459 11 "A" 2
54311 13758 2159  48.205   9874 11 "B" 2
54311 13757 2159   47.05 104459 11 "A" 2
54311 14811 2159      47 228666 11 "C" 2
54311 14811 2160   46.54 228666 11 "C" 2
54311 13758 2160  -47.71   9874 11 "B" 2
end
format %tbtrade_day bdate
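To make the underlying problem concrete, here is a minimal sketch, assuming the -dataex- example above has been loaded. It only demonstrates why the data cannot yet be declared as a panel on permco and bdate; it does not choose a primary share class.

```stata
* Check whether permco and bdate jointly identify the observations.
capture isid permco bdate
if _rc {
    display as text "permco-bdate does not uniquely identify the observations"
}

* xtset needs unique time values within each panel, so it is expected to
* fail here until only one share class per firm-day remains.
capture noisily xtset permco bdate
```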