Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Drop non-trading days

    Hello Statalist,

    I have queries about how to remove non-trading days. I use Stata v14.0. For simplicity, I use panel data, which have 10 firms for period 2006-2014. All data is in daily.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float idc str21 company float caldate double(_marketvalue    _price    _returnindex)
    1 "ALUAR" 16803    4422 2.0100000000000002 6915.96
    1 "ALUAR" 16804  4395.6               1.99 6874.67
    1 "ALUAR" 16805  4461.6               2.02 6977.89
    1 "ALUAR" 16806  4474.8 2.0300000000000002 6998.54
    1 "ALUAR" 16807    4554               2.07 7122.41
    1 ""      16808       .                  .       .
    1 ""      16809       .                  .       .
    1 "ALUAR" 16810 4593.59               2.08 7184.34
    1 "ALUAR" 16811 4593.59               2.08 7184.34
    1 "ALUAR" 16812 4593.59               2.08 7184.34
    1 "ALUAR" 16813 4580.39               2.08  7163.7
    1 "ALUAR" 16814  4540.8               2.06 7101.76
    1 ""      16815       .                  .       .
    1 ""      16816       .                  .       .
    1 "ALUAR" 16817  4501.2               2.04 7039.83
    end
    format %td caldate

    I would like to create Return as a new variable using this command:

    gen return = ln(_price /lag_price)*100

    My question is, in order to generate return, should I use the ordinary calendar date or use the business calendar date? If I use the business calendar date, there are lots of missing value.

    After generated the return, I would like to drop the non-trading days. Here, the non-trading days defined as days on which 90% or more of the stocks listed have a return equal to zero. I have tried calculated manually but it is not an efficient way. Does anyone have ideas how to solve it?

    The last question is, I would like to drop a stock if the number of zero-return days is more than 80% in a given month. I have tried use this command:

    tab return if return ==0

    However, this is not what I want because it described in daily. How can I jumped from the daily return to the 80% in a month, let say zero-return stocks >80% in January, February and so on?

    I really need help from Statalist. Your help is highly appreciated.

    Regards,
    Rozita

  • #2
    You will have fewer missing values if you correctly use the business calendar than if you use your variable caldate. Compare return and return2

    Code:
    clear
    ***Open your dataset
    input float idc str21 company float caldate double(_marketvalue    _price    _returnindex)
    1 "ALUAR" 16803    4422 2.0100000000000002 6915.96
    1 "ALUAR" 16804  4395.6               1.99 6874.67
    1 "ALUAR" 16805  4461.6               2.02 6977.89
    1 "ALUAR" 16806  4474.8 2.0300000000000002 6998.54
    1 "ALUAR" 16807    4554               2.07 7122.41
    1 ""      16808       .                  .       .
    1 ""      16809       .                  .       .
    1 "ALUAR" 16810 4593.59               2.08 7184.34
    1 "ALUAR" 16811 4593.59               2.08 7184.34
    1 "ALUAR" 16812 4593.59               2.08 7184.34
    1 "ALUAR" 16813 4580.39               2.08  7163.7
    1 "ALUAR" 16814  4540.8               2.06 7101.76
    1 ""      16815       .                  .       .
    1 ""      16816       .                  .       .
    1 "ALUAR" 16817  4501.2               2.04 7039.83
    end
    format %td caldate
    
    ***create business calendar from your data
    ***here, a business date is defined as a date where
    ***you have no data on a company
    drop if company==""
    bcal create mybcal, from(caldate) replace
    
    **clear and open your dataset again
    clear
    input float idc str21 company float caldate double(_marketvalue    _price    _returnindex)
    1 "ALUAR" 16803    4422 2.0100000000000002 6915.96
    1 "ALUAR" 16804  4395.6               1.99 6874.67
    1 "ALUAR" 16805  4461.6               2.02 6977.89
    1 "ALUAR" 16806  4474.8 2.0300000000000002 6998.54
    1 "ALUAR" 16807    4554               2.07 7122.41
    1 ""      16808       .                  .       .
    1 ""      16809       .                  .       .
    1 "ALUAR" 16810 4593.59               2.08 7184.34
    1 "ALUAR" 16811 4593.59               2.08 7184.34
    1 "ALUAR" 16812 4593.59               2.08 7184.34
    1 "ALUAR" 16813 4580.39               2.08  7163.7
    1 "ALUAR" 16814  4540.8               2.06 7101.76
    1 ""      16815       .                  .       .
    1 ""      16816       .                  .       .
    1 "ALUAR" 16817  4501.2               2.04 7039.83
    end
    format %td caldate
    
    ***apply business calendar
    gen bcaldate=bofd("mybcal", caldate)
    format bcaldate %tbmybcal
    
    tsset bcaldate
    gen return = ln(_price /L._price)*100
    
    tsset caldate
    gen return2 =ln(_price /L._price)*100
    
    sort caldate
    list
    tssetbcaldate
    The one based on the business calendar (return) is missing values on Sat & Sun, whereas the one based on the caldate (return2) is missing Mondays as well because the lag of _price on Mondays (Sundays) is also missing.

    To drop the companies with fewer than 80% of trading days, you want to get a monthly variable and then calculate by month and company:

    Code:
    *****************
    ***NON-TRADE DAYS
    *****************
    
    gen monthly=mofd(bcaldate)
    format monthly %tm
    
    **count the number of non-trade days
    bysort company monthly: egen flag=count(return) if return==0
    
    **spread that number to all days in the month
    bysort company monthly: egen nzero=mean(flag)
    
    **count the total number of days (when return is not missing)
    bysort company monthly: egen ndays=count(return)
    
    *get the percent of non-trade days for a company in a month
    bysort company monthly: gen non_trade=nzero/ndays * 100
    
    list
    drop if non_trade>80  //drop if greater than 80%
    
    **clean up
    drop flag nzero ndays
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1

    Comment


    • #3
      Dear Mr Carole J. Wilson,

      Thank you for the response. I have tried the commands up to create business calendar days, and they work. However, when I type command:
      . tsset bcaldate
      I get an error with the dialog is "repeated time values in sample":
      repeated time values within panel r(451);
      I have tried many times in case I have repeated the format, but the same error dialog would appear.

      I learned from the website that this problem is very serious because Stata is unable to proceed with any commands that depend on my data being as panel data. That is true because when I proceed to next command to calculate return, I get report:
      time variable not set r(111) Do you have any ideas how to solve this matter? Help from others is highly welcome. Thank you in advanced.

      Regards,
      rozita


      Comment


      • #4
        I get it. I need command:

        . tsset idc bcaldate

        Now can proceed to the next stage. Thank you.

        Comment

        Working...
        X