Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating a new variable while using a certain subgroup of the dataset

    Hello all,

    I am having with my code. The dataset I am working with looks like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long pair_num double secid long(datadate exdate) float(nstrike_price straddle_price)
     1 100974 13830 13868 30 3.90625
     1 100974 13830 13868 30 3.90625
     2 100974 13832 13868 30  3.8125
     2 100974 13832 13868 30  3.8125
     3 100974 13835 13868 30 3.78125
     3 100974 13835 13868 30 3.78125
     4 100974 13838 13868 35  3.8125
     4 100974 13838 13868 35  3.8125
     5 100974 13842 13868 30 3.71875
     5 100974 13842 13868 30 3.71875
     6 100974 13842 13896 30 4.71875
     6 100974 13842 13896 30 4.71875
     7 100974 13843 13896 30 4.71875
     7 100974 13843 13896 30 4.71875
     8 100974 13844 13868 30 3.34375
     8 100974 13844 13868 30 3.34375
     9 100974 13844 13896 30    4.75
     9 100974 13844 13896 30    4.75
    10 100974 13846 13868 30  3.1875
    10 100974 13846 13868 30  3.1875
    11 100974 13846 13896 30    4.75
    11 100974 13846 13896 30    4.75
    12 100974 13849 13868 30       3
    12 100974 13849 13868 30       3
    13 100974 13849 13896 30   4.625
    13 100974 13849 13896 30   4.625
    14 100974 13850 13868 30 2.90625
    14 100974 13850 13868 30 2.90625
    15 100974 13851 13868 30 2.84375
    15 100974 13851 13868 30 2.84375
    end
    format %td datadate
    format %td exdate
    label var secid "Security ID" 
    label var datadate "The Date of this Price" 
    label var exdate "Expiration Date of the Option"

    I have matched observations with the same "secid", same "datadate", same "exdate", same "nstrike_price", and same "straddle_price". I would like to be able to calculate the change in "straddle_price", which can be expressed as: straddle_price[_n+1]- straddle_price. This [_n+1] just signifies the next value in "datadate" (so from 13830 to 13832, for example); the "exdate", "secid", and "nstrike_price" should remain the same. Does someone know how this could be done?

    Thanks!

  • #2
    Is this what you want?

    Code:
    bys secid (datadate): gen wanted= straddle_price[_n+1]- straddle_price if straddle_price[_n+1]!= straddle_price
    bys secid datadate (wanted): replace wanted= wanted[1]

    Comment


    • #3
      Not quite Andrew, but I think the problem here is that I am working with panel data, and I must use xtset

      Comment


      • #4
        What you are showing is not panel data. It is matched pairs, in long layout, from data that perhaps originally was panel data before you created the pairs. Moreover, your dates are spaced irregularly, so you will not be able to use time series operators anyway.

        Try this:
        Code:
        frame put pair_num secid datadate exdate nstrike_price straddle_price, into(working)
        frame working {
            duplicates drop
            by secid exdate nstrike_price (datadate), sort: gen wanted = ///
                straddle_price[_n+1]-straddle_price
        }
        frlink m:1 pair_num, frame(working)
        frget wanted, from(working)

        Comment


        • #5
          Thank you Clyde. That's exactly what I was looking for

          Comment

          Working...
          X