  • Keeping Observations with Minimum Value in Panel Data

    Hi Statalist users,
    I have intraday trading data for 100 stocks for 100 days. Each day contains several hundred trades ordered by time stamps (to the millisecond). At the beginning of trading each day, there are several trades stamped to the exact same millisecond (so these are the first set of trades for the day). I would like to keep only those trades.

    My data is laid out as:
    Instrument Date Time Trade Volume Price
    BHP 20051029 10:00:00.000 250 1.57
    BHP 20051029 10:00:00.000 333 1.57
    BHP 20051029 10:14:00.079 456 1.66
    BHP 20051029 10:19:47.016 92 1.68
    BHP 20051029 10:23:17.001 56 1.54
    BHP 20051029 10:33:27.165 387 1.57
    CBA 20051029 10:00:00.000 100 1.34
    CBA 20051029 10:00:00.000 99 1.34
    CBA 20051029 10:00:00.000 555 1.34
    CBA 20051029 10:15:32.003 600 1.35
    CBA 20051029 10:17:55.004 10 1.34
    CBA 20051029 10:17:55.004 999 1.33
    So in this case I would like to keep only the first 2 trades of BHP and the first 3 trades of CBA.

    Could someone please suggest a way to code this?


    Thanks,

    Alex

  • #2
    Depends on what type of variable your 'Time' var is.

    If it is a string (shown in red in the data editor):
    Code:
    keep if Time=="10:00:00.000"
    If it's a numeric time variable:
    Code:
    keep if Time==0000000?
    Where 0000000? should be the numeric value you can find in your data editor, which Stata displays as "10:00:00.000". This would be the number of milliseconds since midnight.
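
    As a rough illustration, that number can also be computed rather than looked up. This sketch assumes the variable holds only a time of day in Stata's clock (%tc) encoding, with no date component; if a date is included, the value will be different. Ten hours is 10*60*60*1000 = 36,000,000 milliseconds, and clock() with the "hms" mask should parse the string to the same scale:
    Code:
    * Assumption: time of day only, measured in milliseconds since midnight
    display %12.0f 10*60*60*1000
    display %12.0f clock("10:00:00.000", "hms")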

    To make it easier for people to see such differences, please post data examples using dataex in future posts; the FAQ (http://www.statalist.org/forums/help#stata) explains how and why.
    This makes it a lot easier for people on the forum to copy your data into Stata.
    Code:
    ssc install dataex
    dataex


    • #3
      Hi Alex,
      and welcome to Statalist!

      I second Jorrit Gosens' recommendation to use -dataex- in the future. His solution is the straightforward way to solve the problem you showed us.

      However, the minimum timestamp for a given day may not be exactly "10:00:00.000"; it could be, say, "10:00:00.001". In that case, you would need to identify the minimum time for each instrument and day, and keep all observations matching that minimum:
      Code:
      bysort Instrument Date: egen mintimeperday=min(Time)
      keep if Time==mintimeperday
      This, of course, only works if your variable "Time" is a numeric variable. If not, you have to convert it -- show us a real sample of the data using -dataex-, and we can try to help you with this.
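
      In case it helps in the meantime, here is a minimal sketch of such a conversion. It assumes "Time" is a string in the form "10:00:00.000", as in the listing in #1; the variable name Time_ms is made up for illustration. The new variable is created as a double so that millisecond values are held exactly.
      Code:
      * Assumption: Time is a string like "10:00:00.000"; Time_ms is a hypothetical name
      generate double Time_ms = clock(Time, "hms")
      format Time_ms %tcHH:MM:SS.sss
      * Any observations counted here had a Time string that clock() could not parse
      count if missing(Time_ms)
      If the count is zero, Time_ms can take the place of Time in the commands above.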

      Regards
      Bela


      • #4
        Hey Alex,

        sounds like you might want to familiarize yourself with date and time variables and how to deal with them in Stata - considering you will probably need to handle times as numeric values. You might want to have a look at this excellent overview:
        https://web.stanford.edu/group/ssds/...in%20Stata.pdf

        Best regards,
        Boris


        • #5
          Note that the advice in #3 should be qualified. Using egen here will typically mean putting the minimum times in a float variable, which cannot hold millisecond date-times exactly. You should always use a double variable for them.
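
          A sketch of that qualification applied to the code in #3, assuming Time is already a numeric clock variable stored as a double:
          Code:
          * Request double storage explicitly so the minimum is held exactly
          bysort Instrument Date: egen double mintimeperday = min(Time)
          keep if Time == mintimeperday
          An alternative that sidesteps the storage type of a new variable is to sort within groups and compare each time with the first one. This should also work if Time is a string in a fixed HH:MM:SS.sss layout, since such strings sort in time order:
          Code:
          bysort Instrument Date (Time): keep if Time == Time[1]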


          • #6
            Hi everyone,

            Thank you for your quick replies and warm welcome.

            Your help is highly appreciated. Jorrit, you were correct in assuming it was a string variable. That was my issue.

            Thanks again for the help.
