Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging datasets

    Dear all,

    I currently have one dataset with IPO data for a specific sample of firms and sample period. Among these data I have the lockup expiration dates and I want to calculate
    cumulative abnormal returns for a time frame 3 days before and 3 days after. I have obtained the returns from CRSP and currently have them in another dataset. Nevertheless I have thousands of returns for each firm covering the whole sample period while I only need the returns from t=-3 to t=+3. So I want to ask how I can merge the two datasets keeping only the returns I am interested in (the ones around the lockup expiration dates).

    Thank you in advance




    Last edited by papei; 20 Sep 2014, 15:33.

  • #2
    I have no idea what 'lockup expiration dates' and 'cumulative abnormal returns' are, but I think one way to tackle your problem would be to first drop all observations outside your desired range and then merge only when there is a match.

    For the example code below:
    t= days since expiration date(?) (I assume you already have a variable that expresses whether it is 3 days before or 3 days after the expiration date)
    firmid= identification number of the firm that I assume is present in both your IPO and CRSP dataset.
    date = I assume there is a data variable in both IPO and CRSP datasets so that you can match the returns of each day

    ~~~~example code~~~~~~~~~

    keep if t>=-3 & t<=3 //this keeps only the range from -3 to +3 days
    sort firmid date //you always need to sort before you can use "joinby"
    joinby firmid date using "[add you CRSP file path name here]", unmatched(master) //'unmatched(master) ensures Stata only adds those cases for the range t=-3/3

    ~~~~end of example code~~~~


    Comment


    • #3
      I just noticed you first posted in the general forum and reading the reply made me realise the structure of your IPO dataset is different from what I imagined

      I suspect the error message you got (not unique) is due to the IPO sample dataset containing multiple lock-up expiration dates per firm. Joseph's code created a -3/3 variable for each lock-up expiration date, so if a firm has two (or more) lock-up expiration dates the values on 'return_date' are not unique within firms.

      Tweaking Joseph Coveney's code to create a variable that is a date rather than -3/3 variable:

      Code:
      use "IPO data", clear
      expand 7
      generate return_date = lockup_expiration_date //name of date variable in "returns from CRSP" dataset
      format return_date %dD_m_Y //this puts the variable in date format.
                                                  //This should be the same date format as your
                                                  //lockup_expiration_date variable
      bysort firm_id: replace return_date = return_date + (_n - 4)
      joinby firm_id return_date using "[CRSP file path]", unmatched(master)
      You may need to adjust this code, depending on the format in which the date is stored.
      The date format will need to be the same in the IPO and CRSP datasets.


      If both datasets contain a variable called 'date' you will need to adjust the code by first renaming the variable in the CRSP data set:

      Code:
      use "CRSP data", clear
      rename date=return_date
      save "CRSP_returndate", replace
      
      use "IPO data", clear
      expand 7
      generate return_date = lockup_expiration_date
      format return_date %dD_m_Y
      bysort firm_id: replace return_date = return_date + (_n - 4)
      joinby firm_id return_date using "[CRSP_returndate file path]", unmatched(master)
      You may still get an error message about non-unique observations if the lock-up expiration codes for a firm are less than 7 days apart.
      If you get an error message about non-unique observations, this would be the first thing to check
      Last edited by Evelyn Ersanilli; 20 Sep 2014, 19:07.

      Comment


      • #4
        Thank you for your help. It seems that Joseph Coveney's code works. I had to make the date format the same for both datasets apparently but I did not know that since I am a new Stata user. So thank you for that.

        However I noticed that I have another problem. I got the returns from CRSP but apparently this database does not offer data for weekends. As a result while the code worked I do not have values for some days (weekend days). I guess that when I have my event dates (lockup expiration dates) on non-trading dates (weekends), there should be a conversion of these non-trading dates to the next trading date i.e Monday. Also even if I have my event date on a trading date, when some of the days in my event window coincide with weekends my event window should be extended so that I could get values for all seven days. For example if my lockup expiration day is on Wednesday I do have the returns from Monday until Friday but I miss values for Sunday (third day before the event) and Saturday (third day after the event). So instead of Sunday where there are no values I would like to use Friday preceding that and instead of Saturday use Monday following that.

        I know it sounds complicated but any kind of help would be really appreciated.
        Last edited by papei; 21 Sep 2014, 12:47.

        Comment


        • #5
          I can't give you specific advice because I have never used these myself, as these issues don't arise in my line of work, but I think you need to read [D] Datetime Business Calendars in the online manuals. They are designed specifically to deal with weekend and holiday and other calendar gap issues such as the one you are facing.

          Comment


          • #6
            Clyde Schechter thank you very much for the information. I will try to figure it out.

            Comment

            Working...
            X