Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Stock filter screen

    I need clarification about a few things that are written below. I have a daily stock price dataset from 01-07-2016 to 30-06-2018, i.e., 24 months. The first 12 months, i.e. 01-07-2016 to 30-06-2017 is for selecting of liquid stocks. The second 12 months, i.e. 01-07-2017 to 30-06-2018 is for estimation purpose. Using first 12 months of data, i want to drop stocks which do no meet a certain liquidity criteria, i.e., number of days of trade. Say, if a stock does not have at least 200 pr values during 01-07-2016 to 30-06-2017 should be dropped from the sample. Moreover, i need the number of stocks that are dropped using the given criteria. I tried using these few codes. Just want to confirm if they are rightly specified and if someone make them more precise. The data example is also appended.

    The last command, i.e. drop need particular attention. What it should be to ensure that using nobs<=200 in year first, stocks have to be dropped in year second.


    Code:
    //for generating no of observation per company per year
    bys stock_id : egen nobs= count(pr) if mdate<=ym(2017,6)
    
    //number of illiquid stocks filtered per month
    by stock_id: gen excl= _N if nobs<=200 & mdate<=ym(2017,6)
    
    //for dropping of illiquid stocks
    drop if nobs<=200
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float stock_id str52 stock int date float(mdate year) double pr
    1 "20 Microns Ltd." 20636 678 2016 30.05
    1 "20 Microns Ltd." 20639 678 2016  29.8
    1 "20 Microns Ltd." 20640 678 2016 30.35
    1 "20 Microns Ltd." 20642 678 2016  29.8
    1 "20 Microns Ltd." 20643 678 2016  29.9
    1 "20 Microns Ltd." 20646 678 2016    30
    1 "20 Microns Ltd." 20647 678 2016  29.6
    1 "20 Microns Ltd." 20648 678 2016  29.4
    1 "20 Microns Ltd." 20649 678 2016 29.05
    1 "20 Microns Ltd." 20650 678 2016 29.25
    1 "20 Microns Ltd." 20653 678 2016 28.65
    1 "20 Microns Ltd." 20654 678 2016  28.5
    1 "20 Microns Ltd." 20655 678 2016  28.6
    1 "20 Microns Ltd." 20656 678 2016 28.45
    1 "20 Microns Ltd." 20657 678 2016 28.55
    1 "20 Microns Ltd." 20660 678 2016 28.95
    1 "20 Microns Ltd." 20661 678 2016 28.85
    1 "20 Microns Ltd." 20662 678 2016  29.8
    1 "20 Microns Ltd." 20663 678 2016 30.15
    1 "20 Microns Ltd." 20664 678 2016  29.3
    1 "20 Microns Ltd." 20667 679 2016 28.85
    1 "20 Microns Ltd." 20668 679 2016 29.35
    1 "20 Microns Ltd." 20669 679 2016  29.8
    1 "20 Microns Ltd." 20670 679 2016  29.9
    1 "20 Microns Ltd." 20671 679 2016 29.85
    1 "20 Microns Ltd." 20674 679 2016  35.7
    1 "20 Microns Ltd." 20675 679 2016    39
    1 "20 Microns Ltd." 20676 679 2016 39.45
    1 "20 Microns Ltd." 20677 679 2016  39.6
    1 "20 Microns Ltd." 20678 679 2016 39.85
    1 "20 Microns Ltd." 20682 679 2016  39.2
    1 "20 Microns Ltd." 20683 679 2016  38.6
    1 "20 Microns Ltd." 20684 679 2016 39.45
    1 "20 Microns Ltd." 20685 679 2016 39.25
    1 "20 Microns Ltd." 20688 679 2016  39.2
    1 "20 Microns Ltd." 20689 679 2016  38.9
    1 "20 Microns Ltd." 20690 679 2016 39.35
    1 "20 Microns Ltd." 20691 679 2016  38.6
    1 "20 Microns Ltd." 20692 679 2016  38.7
    1 "20 Microns Ltd." 20695 679 2016  37.7
    1 "20 Microns Ltd." 20696 679 2016 38.05
    1 "20 Microns Ltd." 20697 679 2016  38.2
    1 "20 Microns Ltd." 20698 680 2016  36.3
    1 "20 Microns Ltd." 20699 680 2016 37.45
    1 "20 Microns Ltd." 20703 680 2016 36.95
    1 "20 Microns Ltd." 20704 680 2016 37.45
    1 "20 Microns Ltd." 20705 680 2016 37.55
    1 "20 Microns Ltd." 20706 680 2016 37.55
    1 "20 Microns Ltd." 20709 680 2016 36.65
    1 "20 Microns Ltd." 20711 680 2016  36.7
    1 "20 Microns Ltd." 20712 680 2016 36.65
    1 "20 Microns Ltd." 20713 680 2016 37.45
    1 "20 Microns Ltd." 20716 680 2016 36.85
    1 "20 Microns Ltd." 20717 680 2016    37
    1 "20 Microns Ltd." 20718 680 2016 36.75
    1 "20 Microns Ltd." 20719 680 2016 38.95
    1 "20 Microns Ltd." 20720 680 2016  38.3
    1 "20 Microns Ltd." 20723 680 2016 38.15
    1 "20 Microns Ltd." 20724 680 2016  38.3
    1 "20 Microns Ltd." 20725 680 2016 38.25
    1 "20 Microns Ltd." 20726 680 2016 35.15
    1 "20 Microns Ltd." 20727 680 2016 36.45
    1 "20 Microns Ltd." 20730 681 2016  36.7
    1 "20 Microns Ltd." 20731 681 2016  36.2
    1 "20 Microns Ltd." 20732 681 2016  36.4
    1 "20 Microns Ltd." 20733 681 2016  36.5
    1 "20 Microns Ltd." 20734 681 2016  36.4
    1 "20 Microns Ltd." 20737 681 2016  35.7
    1 "20 Microns Ltd." 20740 681 2016  35.2
    1 "20 Microns Ltd." 20741 681 2016  37.8
    1 "20 Microns Ltd." 20744 681 2016 38.05
    1 "20 Microns Ltd." 20745 681 2016 38.55
    1 "20 Microns Ltd." 20746 681 2016    39
    1 "20 Microns Ltd." 20747 681 2016    39
    1 "20 Microns Ltd." 20748 681 2016 39.45
    1 "20 Microns Ltd." 20751 681 2016  39.9
    1 "20 Microns Ltd." 20752 681 2016  39.6
    1 "20 Microns Ltd." 20753 681 2016 39.65
    1 "20 Microns Ltd." 20754 681 2016  38.5
    1 "20 Microns Ltd." 20755 681 2016 39.95
    1 "20 Microns Ltd." 20757 681 2016 40.65
    1 "20 Microns Ltd." 20759 682 2016 41.75
    1 "20 Microns Ltd." 20760 682 2016    40
    1 "20 Microns Ltd." 20761 682 2016 42.55
    1 "20 Microns Ltd." 20762 682 2016    41
    1 "20 Microns Ltd." 20765 682 2016 43.15
    1 "20 Microns Ltd." 20766 682 2016 42.15
    1 "20 Microns Ltd." 20767 682 2016  40.8
    1 "20 Microns Ltd." 20768 682 2016  40.9
    1 "20 Microns Ltd." 20769 682 2016 39.05
    1 "20 Microns Ltd." 20773 682 2016 36.15
    1 "20 Microns Ltd." 20774 682 2016 36.15
    1 "20 Microns Ltd." 20775 682 2016    34
    1 "20 Microns Ltd." 20776 682 2016 34.55
    1 "20 Microns Ltd." 20779 682 2016 32.35
    1 "20 Microns Ltd." 20780 682 2016  32.6
    1 "20 Microns Ltd." 20781 682 2016 34.35
    1 "20 Microns Ltd." 20782 682 2016  34.7
    1 "20 Microns Ltd." 20783 682 2016 35.45
    1 "20 Microns Ltd." 20786 682 2016 35.75
    end
    format %td date
    format %tm mdate
    .
    Last edited by Sartaj Hussain; 09 Oct 2022, 11:49.

  • #2
    The example data provided contains only one stock_id and only year 2016. Actually, any example data set that would properly allow testing would be too large to post here. So in the code below, I create a toy data set with the relevant variables and properties.

    The code you show is not correct. The problem is that the original -egen nobs = count(pr) if mdate <= ym(2017, 6)- only populates the nobs variable in the fy 2016 observations, so the subsequent commands, which look at observations from the later year, don't "see" its results. Here's how I would do this:

    Code:
    //  CREATE A DEMONSTRATION DATA SET
    clear*
    set obs 2
    gen year = 2015 + _n
    expand 5
    by year, sort: gen stock_id = _n
    expand 365
    by stock_id year, sort: gen date = mdy(6, 30, year) + _n
    format date %td
    gen mdate = mofd(date)
    format mdate %td
    set seed 1234
    gen pr = runiform()
    replace pr = . if inlist(stock_id, 1, 3, 5) & runiform() < .5
    replace pr = . if inlist(stock_id, 2, 4) & runiform() < 0.2
    tab stock_id year if !missing(pr)
    
    
    //  IDENTIFY STOCKS WITH FEWER THAN 200 OBS IN FY 2017
    isid stock_id date, sort
    by stock_id: egen nobs = count(pr) if mdate <= ym(2017, 6)
    
    //  COUNT & TABULATE THE OBSERVATIONS TO BE DROPPED
    by stock_id: egen byte to_drop = max(nobs < 200)
    version 16: table stock_id year if to_drop, c(count pr)
    
    //  DROP THEM
    drop if to_drop
    tab stock_id year
    Note: My demonstration data differs in organization from yours in that the absence of price data is represented by pr having missing values; whereas in your example it is represented by the absence of any observation for the date. But my code will work the same way in either case.

    Comment


    • #3
      #2 is fine. But i need following information also.

      1. Total Number of stocks in the estimation year.
      2. Number of stocks excluded.
      3. Percentage of stocks excluded.

      Besides this, is it possible to generate following table showing distribution of percentiles of trading days. Specimen of table is appended.

      Click image for larger version

Name:	liq_table.PNG
Views:	1
Size:	7.8 KB
ID:	1685000

      Comment


      • #4
        The following includes a larger demonstration data set (you can't compute deciles of trading days on 5 firms), and it shows everything you ask. The layout of the table of deciles is not as you show, but the information is there.

        Code:
        //  CREATE A DEMONSTRATION DATA SET
        clear*
        set obs 2
        gen year = 2015 + _n
        expand 3000
        by year, sort: gen stock_id = _n
        expand 365
        by stock_id year, sort: gen date = mdy(6, 30, year) + _n
        format date %td
        gen mdate = mofd(date)
        format mdate %tm
        gen fyear = yofd(date + 184)
        set seed 1234
        gen pr = runiform()
        replace pr = . if mod(stock_id, 3) == 1 & runiform() < .5
        replace pr = . if mod(stock_id, 3) == 2 & runiform() < 0.2
        
        
        //  IDENTIFY STOCKS WITH FEWER THAN 200 OBS IN FY 2017
        isid stock_id date, sort
        by stock_id: egen nobs = count(pr) if mdate <= ym(2017, 6)
        
        //  COUNT & TABULATE THE stock_ids TO BE DROPPED
        by stock_id: egen byte to_drop = max(nobs < 200)
        label define to_drop    0   "Will be Retained"  1   "Will be Dropped"
        label values to_drop to_drop
        
        //  DISPLAY OF NUMBERS OF STOCKS DROPPED & RETAINED
        egen flag = tag(stock_id year)  
        tab to_drop if flag & fyear == 2018
        
        //  TABLE OF DECILES OF TRADING DAYS
        centile nobs if flag & fyear == 2017, centile(10(10)90)
        
        //  DROP THEM
        drop if to_drop

        Comment


        • #5
          The years 2018, 2017 should be from July through June next in the following. For example for 2017, the year runs through July 2016 to June 2017 and for 2018, it is July 2017 to June 2018.

          Code:
           
           tab to_drop if flag & fyear == 2018  centile nobs if flag & fyear == 2017, centile(10(10)90)

          Comment


          • #6
            The years 2018, 2017 should be from July through June next in the following. For example for 2017, the year runs through July 2016 to June 2017 and for 2018, it is July 2017 to June 2018.
            That is what the code does:

            Code:
               fyear |       Min       Max
            ---------+--------------------
                2017 |    2016m7    2017m6
                2018 |    2017m7    2018m6
            ---------+--------------------
               Total |    2016m7    2018m6
            ------------------------------

            Comment

            Working...
            X