Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Date manipulation + volume weighted average

    Dear Statalist users,

    I am a master student from Maastricht University trying to "replicate" parts of a financial article: "Impact of the Dodd-Frank act on credit rating" (Dimitrov et al., 2015). I have basic knowledge in Stata, I managed to make my way till this step but now I m stuck, help is more than greatly appreciated.

    I am using Stata 12.

    cusip_id correspond to different bonds which ratings have been upgraded or downgraded on particular dates (rating_date)
    execution_date corresponds to trades of each bonds that have been made, with corresponding quantity and price next to it.

    What I am trying to do is keep bonds for which there has been at least one trade before and after the rating date. For example, in the example generated by dataex, I would keep the cusip_id "200300606" with a rating date of either 13Feb2013 or 14Feb2013 because each has at least one trade ("execution_trade") before and after the rating dates. In opposition, I would not keep the cusip_id with a rating date of 12Mar2012 because it only has trades before the rating date and not a single trade after.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long rating_date str4 reason str14 cusip_id long execution_date double(quanti price)
    19402 "UPG" "200300606" 19054 4860364.5      40.5
    19402 "UPG" "200300606" 19072    770000      38.5
    19402 "UPG" "200300606" 19072    782500    39.125
    19402 "UPG" "200300606" 19121   2031250    40.625
    19402 "UPG" "200300606" 19121   2025000      40.5
    19402 "UPG" "200300606" 19124    2887.5      38.5
    19402 "UPG" "200300606" 19124    2812.5      37.5
    19402 "UPG" "200300606" 19124    2887.5      38.5
    19402 "UPG" "200300606" 19225 3116814.4      43.4
    19402 "UPG" "200300606" 19225 3113223.6     43.35
    19402 "UPG" "200300606" 19723     30000     45.95
    19402 "UPG" "200300606" 19723     30000        46
    19402 "UPG" "200300606" 19751     75000     46.25
    19402 "UPG" "200300606" 19751     25000     46.25
    19402 "UPG" "200300606" 19751    100000    46.125
    19402 "UPG" "200300606" 19754     25000    46.375
    19402 "UPG" "200300606" 19754     25000      46.5
    19403 "UPG" "200300606" 19054 4860364.5      40.5
    19403 "UPG" "200300606" 19072    782500    39.125
    19403 "UPG" "200300606" 19072    770000      38.5
    19403 "UPG" "200300606" 19121   2031250    40.625
    19403 "UPG" "200300606" 19121   2025000      40.5
    19403 "UPG" "200300606" 19124    2812.5      37.5
    19403 "UPG" "200300606" 19124    2887.5      38.5
    19403 "UPG" "200300606" 19124    2887.5      38.5
    19403 "UPG" "200300606" 19225 3113223.6     43.35
    19403 "UPG" "200300606" 19225 3116814.4      43.4
    19403 "UPG" "200300606" 19723     30000     45.95
    19403 "UPG" "200300606" 19723     30000        46
    19403 "UPG" "200300606" 19751     75000     46.25
    19403 "UPG" "200300606" 19751    100000    46.125
    19403 "UPG" "200300606" 19751     25000     46.25
    19403 "UPG" "200300606" 19754     25000    46.375
    19403 "UPG" "200300606" 19754     25000      46.5
    19064 "UPG" "210387205" 19018  24995.84   27.1399
    19064 "UPG" "210387205" 19018  24995.84 27.187908
    19064 "UPG" "210387205" 19019  11039.14 25.011932
    19064 "UPG" "210387205" 19019  11039.14     27.19
    19064 "UPG" "210387205" 19022      2717     27.17
    19064 "UPG" "210387205" 19022  24338.52   27.0428
    19064 "UPG" "210387205" 19022      2705     27.05
    19064 "UPG" "210387205" 19022      2705   26.7561
    19064 "UPG" "210387205" 19022  24338.52   27.0428
    19064 "UPG" "210387205" 19022   2715.99   27.1599
    19064 "UPG" "210387205" 19022      2717 27.462602
    19064 "UPG" "210387205" 19022   2715.99 27.452611
    19064 "UPG" "210387205" 19024      2714 27.432926
    19064 "UPG" "210387205" 19024      2714     27.14
    19064 "UPG" "210387205" 19024  18983.93   27.1199
    19064 "UPG" "210387205" 19024  18983.93 27.161778
    end
    format %d rating_date
    format %d execution_date
    Then, if anyone has time, I might need an advice with regards to how I could properly calculate the volume weighted average across the days before the rating_date and after the rating_date (this to calculate the impact of a rating change on the price of a bond).

    Thank you very much for you time,

    Pierre Louveau
    Last edited by Pierre Louveau; 27 Sep 2016, 05:05.

  • #2
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long rating_date str4 reason str14 cusip_id long execution_date double(quanti price)
    19402 "UPG" "200300606" 19054 4860364.5      40.5
    19402 "UPG" "200300606" 19072    770000      38.5
    19402 "UPG" "200300606" 19072    782500    39.125
    19402 "UPG" "200300606" 19121   2031250    40.625
    19402 "UPG" "200300606" 19121   2025000      40.5
    19402 "UPG" "200300606" 19124    2887.5      38.5
    19402 "UPG" "200300606" 19124    2812.5      37.5
    19402 "UPG" "200300606" 19124    2887.5      38.5
    19402 "UPG" "200300606" 19225 3116814.4      43.4
    19402 "UPG" "200300606" 19225 3113223.6     43.35
    19402 "UPG" "200300606" 19723     30000     45.95
    19402 "UPG" "200300606" 19723     30000        46
    19402 "UPG" "200300606" 19751     75000     46.25
    19402 "UPG" "200300606" 19751     25000     46.25
    19402 "UPG" "200300606" 19751    100000    46.125
    19402 "UPG" "200300606" 19754     25000    46.375
    19402 "UPG" "200300606" 19754     25000      46.5
    19403 "UPG" "200300606" 19054 4860364.5      40.5
    19403 "UPG" "200300606" 19072    782500    39.125
    19403 "UPG" "200300606" 19072    770000      38.5
    19403 "UPG" "200300606" 19121   2031250    40.625
    19403 "UPG" "200300606" 19121   2025000      40.5
    19403 "UPG" "200300606" 19124    2812.5      37.5
    19403 "UPG" "200300606" 19124    2887.5      38.5
    19403 "UPG" "200300606" 19124    2887.5      38.5
    19403 "UPG" "200300606" 19225 3113223.6     43.35
    19403 "UPG" "200300606" 19225 3116814.4      43.4
    19403 "UPG" "200300606" 19723     30000     45.95
    19403 "UPG" "200300606" 19723     30000        46
    19403 "UPG" "200300606" 19751     75000     46.25
    19403 "UPG" "200300606" 19751    100000    46.125
    19403 "UPG" "200300606" 19751     25000     46.25
    19403 "UPG" "200300606" 19754     25000    46.375
    19403 "UPG" "200300606" 19754     25000      46.5
    19064 "UPG" "210387205" 19018  24995.84   27.1399
    19064 "UPG" "210387205" 19018  24995.84 27.187908
    19064 "UPG" "210387205" 19019  11039.14 25.011932
    19064 "UPG" "210387205" 19019  11039.14     27.19
    19064 "UPG" "210387205" 19022      2717     27.17
    19064 "UPG" "210387205" 19022  24338.52   27.0428
    19064 "UPG" "210387205" 19022      2705     27.05
    19064 "UPG" "210387205" 19022      2705   26.7561
    19064 "UPG" "210387205" 19022  24338.52   27.0428
    19064 "UPG" "210387205" 19022   2715.99   27.1599
    19064 "UPG" "210387205" 19022      2717 27.462602
    19064 "UPG" "210387205" 19022   2715.99 27.452611
    19064 "UPG" "210387205" 19024      2714 27.432926
    19064 "UPG" "210387205" 19024      2714     27.14
    19064 "UPG" "210387205" 19024  18983.93   27.1199
    19064 "UPG" "210387205" 19024  18983.93 27.161778
    end
    format %d rating_date
    format %d execution_date
    
    //    ELIMINATE CASES LACKING PRE- AND POST-RATING TRADE
    assert !missing(cusip_id, rating_date, execution_date)
    by cusip_id rating_date, sort: egen has_before = ///
        max(execution_date < rating_date)
    by cusip_id rating_date: egen has_after = ///
        max(execution_date > rating_date)
    keep if has_before & has_after
    
    //    CALCULATE VOLUME WEIGHTED AVERAGE PRICE
    //    BOTH BEFORE AND AFTER
    gen byte era = 1 if execution_date < rating_date
    replace era = 2 if execution_date > rating_date
    by cusip_id rating_date era, sort: egen numerator = total(quanti*price)
    by cusip_id rating_date era: egen denominator = total(quanti)
    gen vol_weighted_avg = numerator/denominator
    Notes:
    1. I assume that cusip, rating_date, and execution_date are never missing. This is verified in the -assert- statement.
    2. I assume that the variable quanti is what you have in mind when you refer to volume weighting of the average.
    3. In your example data, execution_date is never the same as rating_date. But that might occur in the full data set. You don't say how you would want to handle that situation. In the code above, such observations are treated as being neither before nor after the rating date (neither for deciding which observations to keep, nor for calculating volume-weighted averages). If that is not your intent, you can adjust some < or > symbols to <= or >= accordingly.

    Do familiarize yourself with the -egen- command. It is full of very useful functions for data management.

    Comment


    • #3
      If you group observations by cusp_id and rating_date, you can then count the number of observations within the group that have an execution date before and after the rating date.

      Code:
      sort cusip_id rating_date execution_date
      by cusip_id rating_date: egen before = total(rating_date > execution_date)
      by cusip_id rating_date: egen after = total(rating_date < execution_date)
      gen tokeep = before > 0 & after > 0

      Comment


      • #4
        I have implemented both versions of the solutions and both work very well. Thank you so much !!!

        @ Clyde, I am sorry that my problem lacked some details. Your assumption were exactly what I had in mind. Concerning the Note n°3, it is probable that execution_date will never be the same as rating_date because of the high illiquidity of the bond market but in case it happens, I will treat these observations as "after" the rating date.

        Again, thank you very much, I will not forget to mention the help I receive in my paper.

        Pierre Louveau

        Comment

        Working...
        X