Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Help with identifying routine trades.

    Hello, I am having a similar issue. I want to create a routine trade variable based on Cohen et al. (2012) definition. That is, characterizing insiders as routine traders if they trade in the same calendar month in three consecutive years. Below is an idea of my data structure:


    PERSONID CUSIP6 TRANCODE ACQDISP TRANDATE TPRICE OWNERSHIP SHARES mdate
    18 539451 S D 02dec2005 28.76 D 500 2005m12
    22 22281N P A 17jul1998 26.13 D 1000 1998m7
    36 079860 S D 04jan2002 40.75 D 11400 2002m1
    36 079860 S D 03jan2002 40 D 62476 2002m1
    36 079860 S D 03jan2002 40 D 33800 2002m1
    40 219350 S D 20dec1996 43.63 D 30000 1996m12
    40 219350 S D 04mar1999 55.75 60000 1999m3
    40 219350 S D 20apr1999 58.84 40000 1999m4
    40 219350 S D 21apr1999 60.05 10665 1999m4
    40 219350 S D 21apr1999 59.26 D 24335 1999m4
    40 219350 S D 02mar2000 198.15 D 36000 2000m3
    40 219350 S D 02may2000 199.44 D 65000 2000m5
    40 219350 S D 01aug2000 238.33 D 60000 2000m8
    40 219350 S D 26jan2001 56.25 75000 2001m1
    40 219350 S D 29jan2001 57.53 D 85000 2001m1
    40 219350 S D 25jan2001 60.13 40000 2001m1
    70 24869P P A 03mar1998 9.75 D 200000 1998m3
    146 468201 P A 12jul1996 5.63 D 7000 1996m7
    146 468201 P A 03oct1996 4.63 1000 1996m10
    146 468201 P A 11oct1996 4.88 D 1000 1996m10
    146 468201 P A 10oct1996 4.88 1000 1996m10
    146 468201 P A 27nov1996 4.25 D 3300 1996m11
    146 468201 P A 03dec1996 4 D 5700 1996m12
    146 468201 P A 26aug1997 27.95 12000 1997m8
    146 468201 P A 16sep1997 38.25 D 2350 1997m9
    146 468201 S D 22dec1997 67.24 D 45580 1997m12
    146 87951U P A 26aug1998 5.75 D 56100 1998m8
    146 87951U P A 27aug1998 5.36 43900 1998m8
    146 960878 P A 13mar2015 23.95 D 18100 2015m3
    147 20564D S D 17sep1999 36 D 77074 1999m9
    147 20564D S D 17sep1999 36 I 871.162 1999m9
    154 817320 P A 03jan1996 30.5 I 53800 1996m1
    154 817320 P A 01feb1996 32 I 25200 1996m2
    154 817320 P A 13mar1996 34 I 104600 1996m3
    154 817320 S D 10dec2004 60.5 I 64030 2004m12
    154 817320 P A 10dec2004 60.5 I 64030 2004m12
    186 002535 S D 14aug2013 27.942 D 3375 2013m8
    186 002535 S D 14aug2014 25.6552 D 2250 2014m8
    197 032511 S D 28jun1996 57.5 D 15000 1996m6
    197 032511 S D 03jul1996 60 10000 1996m7
    197 032511 S D 10jul1996 60 10000 1996m7
    197 032511 S D 21oct1996 65.5 5300 1996m10
    197 032511 S D 21oct1996 65.88 8500 1996m10
    197 032511 S D 21oct1996 66.25 4700 1996m10
    197 032511 S D 24oct1996 65 4700 1996m10
    197 032511 S D 21oct1996 65.63 500 1996m10
    197 032511 S D 24oct1996 65.38 D 300 1996m10
    197 032511 S D 21oct1996 66.38 21000 1996m10
    197 032511 S D 23oct1996 65.25 10000 1996m10
    197 032511 S D 23oct1996 65.13 5000 1996m10
    197 032511 S D 06apr1998 71.88 D 10000 1998m4
    197 032511 S D 06apr1998 71.94 10000 1998m4
    197 032511 S D 08aug2000 51.02 D 80000 2000m8
    197 032511 S D 10aug2000 54.3 D 80000 2000m8
    197 032511 S D 11sep2000 68.25 D 120000 2000m9
    197 032511 S D 12sep2000 68.51 D 35859 2000m9
    197 032511 S D 29nov2000 63.18 D 120000 2000m11
    197 032511 S D 22dec2000 63.93 D 120000 2000m12
    197 032511 S D 15jan2002 50 I 33308 2002m1


    So far, below is what I tried [following an answer on this forum] but the results seem odd to me AS ONLY 1% of the trades in my data are routine trades. However, Cohen et al. have more than 50% of the trades in their data as routine trades. This tells me that something might be wrong with my code.
    Any help will be greatly appreciated.

    gen monthT=month( TRANDATE)
    gen yearT=year( TRANDATE)
    sort PERSONID yearT monthT
    bysort PERSONID (yearT monthT): gen spell = sum((yearT != yearT[_n-1] + 1) & (monthT != monthT[_n-1]))
    bysort PERSONID spell (yearT monthT): gen routine = (yearT[_N] - yearT[1] >= 1) & (monthT[_N] == monthT[1])


    Thank you in advance!

  • #2
    Well, I don't think your code is correct--if nothing else, it lacks anything that checks whether a run lasts for 3 years. I used a related, but different in detail, calculation, shown below. In the example data you show, there are no instances at all of a three-year run for the same trader making trades in the same month. Presumably in your real data, there are at least a few such instances. Now, there is a subtlety of interpretation that you may be missing. Your code (were it correct), and mine below (which I believe is correct if I understand what is wanted), identify a particular month in which a trader can be considered routine. But there is a gap between that and the trader being routine. If the trader being routine means that he/she is routine in any month, then even a small percentage of trader-months being routine might correspond to a large fraction of the traders being routine.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int personid str6 cusip6 str1(trancode acqdisp) float(trandate tprice) str5 ownership str7 shares float mdate
     18 "539451" "S" "D" 16772   28.76 "D" "500"     551
     22 "22281N" "P" "A" 14077   26.13 "D" "1000"    462
     36 "079860" "S" "D" 15344   40.75 "D" "11400"   504
     36 "079860" "S" "D" 15343      40 "D" "62476"   504
     36 "079860" "S" "D" 15343      40 "D" "33800"   504
     40 "219350" "S" "D" 13503   43.63 "D" "30000"   443
     40 "219350" "S" "D" 14307   55.75 ""  "60000"   470
     40 "219350" "S" "D" 14354   58.84 ""  "40000"   471
     40 "219350" "S" "D" 14355   60.05 ""  "10665"   471
     40 "219350" "S" "D" 14355   59.26 "D" "24335"   471
     40 "219350" "S" "D" 14671  198.15 "D" "36000"   482
     40 "219350" "S" "D" 14732  199.44 "D" "65000"   484
     40 "219350" "S" "D" 14823  238.33 "D" "60000"   487
     40 "219350" "S" "D" 15001   56.25 ""  "75000"   492
     40 "219350" "S" "D" 15004   57.53 "D" "85000"   492
     40 "219350" "S" "D" 15000   60.13 ""  "40000"   492
     70 "24869P" "P" "A" 13941    9.75 "D" "200000"  458
    146 "468201" "P" "A" 13342    5.63 "D" "7000"    438
    146 "468201" "P" "A" 13425    4.63 ""  "1000"    441
    146 "468201" "P" "A" 13433    4.88 "D" "1000"    441
    146 "468201" "P" "A" 13432    4.88 ""  "1000"    441
    146 "468201" "P" "A" 13480    4.25 "D" "3300"    442
    146 "468201" "P" "A" 13486       4 "D" "5700"    443
    146 "468201" "P" "A" 13752   27.95 ""  "12000"   451
    146 "468201" "P" "A" 13773   38.25 "D" "2350"    452
    146 "468201" "S" "D" 13870   67.24 "D" "45580"   455
    146 "87951U" "P" "A" 14117    5.75 "D" "56100"   463
    146 "87951U" "P" "A" 14118    5.36 ""  "43900"   463
    146 "960878" "P" "A" 20160   23.95 "D" "18100"   662
    147 "20564D" "S" "D" 14504      36 "D" "77074"   476
    147 "20564D" "S" "D" 14504      36 "I" "871.162" 476
    154 "817320" "P" "A" 13151    30.5 "I" "53800"   432
    154 "817320" "P" "A" 13180      32 "I" "25200"   433
    154 "817320" "P" "A" 13221      34 "I" "104600"  434
    154 "817320" "S" "D" 16415    60.5 "I" "64030"   539
    154 "817320" "P" "A" 16415    60.5 "I" "64030"   539
    186 "002535" "S" "D" 19584  27.942 "D" "3375"    643
    186 "002535" "S" "D" 19949 25.6552 "D" "2250"    655
    197 "032511" "S" "D" 13328    57.5 "D" "15000"   437
    197 "032511" "S" "D" 13333      60 ""  "10000"   438
    197 "032511" "S" "D" 13340      60 ""  "10000"   438
    197 "032511" "S" "D" 13443    65.5 ""  "5300"    441
    197 "032511" "S" "D" 13443   65.88 ""  "8500"    441
    197 "032511" "S" "D" 13443   66.25 ""  "4700"    441
    197 "032511" "S" "D" 13446      65 ""  "4700"    441
    197 "032511" "S" "D" 13443   65.63 ""  "500"     441
    197 "032511" "S" "D" 13446   65.38 "D" "300"     441
    197 "032511" "S" "D" 13443   66.38 ""  "21000"   441
    197 "032511" "S" "D" 13445   65.25 ""  "10000"   441
    197 "032511" "S" "D" 13445   65.13 ""  "5000"    441
    197 "032511" "S" "D" 13975   71.88 "D" "10000"   459
    197 "032511" "S" "D" 13975   71.94 ""  "10000"   459
    197 "032511" "S" "D" 14830   51.02 "D" "80000"   487
    197 "032511" "S" "D" 14832    54.3 "D" "80000"   487
    197 "032511" "S" "D" 14864   68.25 "D" "120000"  488
    197 "032511" "S" "D" 14865   68.51 "D" "35859"   488
    197 "032511" "S" "D" 14943   63.18 "D" "120000"  490
    197 "032511" "S" "D" 14966   63.93 "D" "120000"  491
    197 "032511" "S" "D" 15355      50 "I" "33308"   504
    end
    format %td trandate
    format %tm mdate
    
    gen month = month(trandate)
    gen year = year(trandate)
    
    tempname working
    frame put personid month year, into(`working')
    frame `working' {
        duplicates drop
        by personid month (year), sort: gen run = sum(year[_n+1] == year + 1)
        by personid month run, sort: gen routine  = (_N >= 3)
    }
    frlink m:1 personid month year, frame(`working')
    frget routine, from(`working')
    
    by personid (routine), sort: gen trader_is_routine = routine[_N]
    That said, the definition of routine you describe in #1 strikes me as rather bizarre. I would think that being "routine" would depend on passing some threshold of frequency of trading, and the compression of trading into the same specific calendar month in consecutive years does not seem relevant to routine, at least as that word is normally understood. So I would just urge you to recheck that you are correctly understanding the article you are correctly understanding the article. I can't do that myself. The citation "Cohen et al. (2012)" may be folklore in your network, but with no other information it is impossible for others to identify what article you have in mind. Checking Google scholar for articles authored by "Cohen" in 2012 turns up over 3,000,000 hits! That's why the Forum FAQ asks people, if they are going to cite articles, to provide complete citations. (I, personally, probably wouldn't be able to get the article even with a complete citation, because I do not work in finance and my institution's library almost surely would not have that journal, nor support unpaywalled access to it online.)

    Finally, you provided some listing output from your data to show the example. Your intent to give example data is appreciated. But listings like this are not as helpful as you might think. If you try to copy/paste it into Stata you will find that the dates come in as strings (which then have to be converted to proper Stata internal format dates before we can work on your problem). You will also find that the last several columns of the listing come in misaligned in the data set, so some "surgery" has to be performed to fix that. The most useful way to show example data is with the -dataex- command. In the future, please use it, as I have done in this post. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment


    • #3
      Thank you Clyde for your reply.

      After looking at the paper again, the definition is in fact different. Routine trade CEO trade that is routine and uninformative: A trade executed in the same calendar month in three or more consecutive years is considered a routine trade.
      Opportunistic trade CEO trade that possesses private information: A trade not executed in the same calendar month in three or more consecutive years is considered an opportunistic trade.

      Also, they define routine traders and opportunistic traders as follow: "we define a routine trader as an insider who placed a trade in the same calendar month for at least a certain number of years in the past.We then define opportunistic traders as everyone else, that is, those insiders who have traded in the same years as the routine insiders, but for whom we cannot detect an obvious discernible pattern in the past timing of their trades."

      I suppose the code above would not be correct then? Any help in finding the correct code will be appreciated.

      Also, for reference is
      Cohen, L., Malloy, C., & Pomorski, L. (2012). Decoding inside information. The Journal of Finance, 67(3), 1009-1043.

      Comment


      • #4
        Well, the code shown in #2 is, I believe, correct with these definitions, if you interpret "a certain number of years in the past" as three years.

        Comment

        Working...
        X