Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Condition on consecutive years (creating a dummy variable)

    Hey everyone,

    I have to assign a dummy equal to one if a person has traded at least once for three consecutive years.

    I recently posted a similar thread: https://www.statalist.org/forums/for...dummy-variable , and Clyde very kind to provide me with an answer:

    Originally posted by Clyde Schechter View Post
    The first step is to extract the year from the month variable. Then the key step is to use the -rangestat- command, by Robert Picard, Nick Cox, and Roberto Ferrer, available from SSC, to determine whether there are purchases in the past 1 year, and again 2 years ago. Then just combine with &.

    Code:
    gen year = yofd(dofm(month))
    
    forvalues lag = 1/2 {
    rangestat (count) bought_`lag'_yrs_ago = tprice, by(personid) interval(year -`lag' -`lag')
    mvencode bought_`lag'_yrs_ago, mv(0)
    }
    
    gen byte wanted = bought_1_yrs_ago & bought_2_yrs_ago
    Note: In your example data, there is only one person, and that person has purchases in every year, so his/her result is always 1, except for the first two years where there is simply no data to refer back to.
    However, I realized that I have to use a different approach for my study - to assign a dummy if a person has traded for three consecutive years. I tried to alter Clyde's code, but with no luck. I wrote a follow up question on that thread, but I am not sure that it was able reach you. I am sorry for posting again, but these few lines are of great importance for my research. Here's a sample of the dataset that I am using:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long personid double(permno tprice) float(month year)
    482 86791      52 481 2000
    482 86791      69 484 2000
    482 86791      63 485 2000
    482 86791      41 523 2003
    482 86791   41.06 524 2003
    482 86791    40.4 525 2003
    482 86791      50 526 2003
    482 86791   47.19 529 2004
    482 86791 45.0002 531 2004
    482 86791    40.5 541 2005
    482 86791      41 542 2005
    482 86791      41 544 2005
    482 86791      41 546 2005
    482 86791      45 547 2005
    482 86791   44.05 548 2005
    482 86791    53.9 553 2006
    482 86791    34.9 559 2006
    482 86791    36.9 560 2006
    482 86791    38.9 561 2006
    482 86791    49.5 565 2007
    482 86791      37 611 2010
    482 86791      38 612 2011
    482 86791    47.9 613 2011
    482 89132   15.05 532 2004
    end
    format %tmMon_CCYY month
    I would really appreciate any help! Thank you very much!

    Best regards,
    Fanetti

  • #2
    To elaborate, I need to assign the dummy to the trades that are consecutive, not the individual. In the above example, the dummy should be equal to one for all trades in years 2003, 2004, 2005, 2006 and 2007, and zero otherwise.

    Any advice would be much appreciated!

    Comment


    • #3
      You need to identify spells of trades in consecutive years. With the data in chronological order, a new spell starts when the difference in years between the current trade and the one before is greater than one. Within spell by-groups, what you want is a spell where the difference in year between the most recent year and the oldest year is more than 2. Something like:

      Code:
      bysort personid (year month): gen spell = sum((year - year[_n-1]) > 1)
      bysort personid spell (year month): gen wanted = (year[_N] - year[1]) >= 2

      Comment


      • #4
        Thank you very much, Robert!

        Comment


        • #5
          Hello, I am having a similar issue. I want to create a routine trade variable based on Cohen et al. (2012) definition. That is, characterizing insiders as routine traders if they trade in the same calendar month in three consecutive years. Below is an idea of my data structure:
          PERSONID CUSIP6 TRANCODE ACQDISP TRANDATE TPRICE OWNERSHIP SHARES mdate
          18 539451 S D 02dec2005 28.76 D 500 2005m12
          22 22281N P A 17jul1998 26.13 D 1000 1998m7
          36 079860 S D 04jan2002 40.75 D 11400 2002m1
          36 079860 S D 03jan2002 40 D 62476 2002m1
          36 079860 S D 03jan2002 40 D 33800 2002m1
          40 219350 S D 20dec1996 43.63 D 30000 1996m12
          40 219350 S D 04mar1999 55.75 60000 1999m3
          40 219350 S D 20apr1999 58.84 40000 1999m4
          40 219350 S D 21apr1999 60.05 10665 1999m4
          40 219350 S D 21apr1999 59.26 D 24335 1999m4
          40 219350 S D 02mar2000 198.15 D 36000 2000m3
          40 219350 S D 02may2000 199.44 D 65000 2000m5
          40 219350 S D 01aug2000 238.33 D 60000 2000m8
          40 219350 S D 26jan2001 56.25 75000 2001m1
          40 219350 S D 29jan2001 57.53 D 85000 2001m1
          40 219350 S D 25jan2001 60.13 40000 2001m1
          70 24869P P A 03mar1998 9.75 D 200000 1998m3
          146 468201 P A 12jul1996 5.63 D 7000 1996m7
          146 468201 P A 03oct1996 4.63 1000 1996m10
          146 468201 P A 11oct1996 4.88 D 1000 1996m10
          146 468201 P A 10oct1996 4.88 1000 1996m10
          146 468201 P A 27nov1996 4.25 D 3300 1996m11
          146 468201 P A 03dec1996 4 D 5700 1996m12
          146 468201 P A 26aug1997 27.95 12000 1997m8
          146 468201 P A 16sep1997 38.25 D 2350 1997m9
          146 468201 S D 22dec1997 67.24 D 45580 1997m12
          146 87951U P A 26aug1998 5.75 D 56100 1998m8
          146 87951U P A 27aug1998 5.36 43900 1998m8
          146 960878 P A 13mar2015 23.95 D 18100 2015m3
          147 20564D S D 17sep1999 36 D 77074 1999m9
          147 20564D S D 17sep1999 36 I 871.162 1999m9
          154 817320 P A 03jan1996 30.5 I 53800 1996m1
          154 817320 P A 01feb1996 32 I 25200 1996m2
          154 817320 P A 13mar1996 34 I 104600 1996m3
          154 817320 S D 10dec2004 60.5 I 64030 2004m12
          154 817320 P A 10dec2004 60.5 I 64030 2004m12
          186 002535 S D 14aug2013 27.942 D 3375 2013m8
          186 002535 S D 14aug2014 25.6552 D 2250 2014m8
          197 032511 S D 28jun1996 57.5 D 15000 1996m6
          197 032511 S D 03jul1996 60 10000 1996m7
          197 032511 S D 10jul1996 60 10000 1996m7
          197 032511 S D 21oct1996 65.5 5300 1996m10
          197 032511 S D 21oct1996 65.88 8500 1996m10
          197 032511 S D 21oct1996 66.25 4700 1996m10
          197 032511 S D 24oct1996 65 4700 1996m10
          197 032511 S D 21oct1996 65.63 500 1996m10
          197 032511 S D 24oct1996 65.38 D 300 1996m10
          197 032511 S D 21oct1996 66.38 21000 1996m10
          197 032511 S D 23oct1996 65.25 10000 1996m10
          197 032511 S D 23oct1996 65.13 5000 1996m10
          197 032511 S D 06apr1998 71.88 D 10000 1998m4
          197 032511 S D 06apr1998 71.94 10000 1998m4
          197 032511 S D 08aug2000 51.02 D 80000 2000m8
          197 032511 S D 10aug2000 54.3 D 80000 2000m8
          197 032511 S D 11sep2000 68.25 D 120000 2000m9
          197 032511 S D 12sep2000 68.51 D 35859 2000m9
          197 032511 S D 29nov2000 63.18 D 120000 2000m11
          197 032511 S D 22dec2000 63.93 D 120000 2000m12
          197 032511 S D 15jan2002 50 I 33308 2002m1


          So far, below is what I tried but the results seem odd to me AS ONLY 1% of the trades in my data are routine trades. However, Cohen et al. have more than 50% of the trades in their data as routine trades. This tells me that something might be wrong with my code.
          Any help will be greatly appreciated.

          gen monthT=month( TRANDATE)
          gen yearT=year( TRANDATE)
          sort PERSONID yearT monthT
          bysort PERSONID (yearT monthT): gen spell = sum((yearT != yearT[_n-1] + 1) & (monthT != monthT[_n-1]))
          bysort PERSONID spell (yearT monthT): gen routine = (yearT[_N] - yearT[1] >= 1) & (monthT[_N] == monthT[1])


          Thank you in advance for your help!


          Comment


          • #6
            #5 is a duplicate post. Asked and answered at https://www.statalist.org/forums/for...routine-trades.

            Comment

            Working...
            X