  • Locate firms that have no value for a particular variable within a certain time frame

    Hello!

    I have data with a firm ID (firm), Rdate (reply date), Edate (event date), and a dummy first_reply that indicates whether a reply is the first reply after the event date. Having identified the first reply after the event, I want to create another dummy, NoOtherReply2_7, equal to 1 if the firm has no other Rdate falling 2 to 7 days after that first Rdate. In other words, it should be 1 when the firm makes no further replies 2 to 7 days after its first post-event reply.

    In the example data below, for firm 150 the event happened on 04feb2020 and the first reply occurred on 14feb2020. The next reply is on 02mar2020, which is later than 7 days after the first reply and so falls outside the 2-to-7-day window. I therefore want NoOtherReply2_7 set to 1.

    For firm 153, the event happened on 04feb2020 and the first reply occurred on 12feb2020. There are several other replies on the same day (ignore these; the window only starts 2 days after the first reply date). The next reply that is not on the same day is on 13feb2020, which I also ignore because I focus on the (2,7) period. The reply after that is on 17feb2020, which is within the (2,7) period of 12feb2020, so NoOtherReply2_7 should be set to 0.
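
    The day counts in these two examples can be checked directly in Stata. This is only a minimal sketch using the dates quoted above; td() and inrange() are built-in Stata functions, and nothing here touches the data set.

    ```stata
    * Firm 153: 17feb2020 is 5 days after the first reply on 12feb2020,
    * so it falls inside the 2-to-7-day window (inrange() returns 1)
    display td(17feb2020) - td(12feb2020)
    display inrange(td(17feb2020) - td(12feb2020), 2, 7)

    * Firm 153: 13feb2020 is only 1 day after, so it is outside the window
    display inrange(td(13feb2020) - td(12feb2020), 2, 7)

    * Firm 150: 02mar2020 is 17 days after the first reply on 14feb2020,
    * so it is outside the window (inrange() returns 0)
    display td(02mar2020) - td(14feb2020)
    display inrange(td(02mar2020) - td(14feb2020), 2, 7)
    ```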

    I hope I have explained clearly what I intend to achieve. I focus only on 2 to 7 days after the first Rdate because I also look separately at the 3-day period around the first Rdate (the day before, the day of, and the day after). For this particular purpose, the window therefore starts 2 days after.

    Thanks a lot for any suggestion!

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double firm float(Rdate Edate first_reply)
    150 21959 21949 1
    150 21976 21949 0
    150 21976 21949 0
    150 21976 21949 0
    150 21976 21949 0
    150 21976 21949 0
    150 21976 21949 0
    150 21984 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22001 21949 0
    150 22019 21949 0
    150 22019 21949 0
    150 22019 21949 0
    150 22019 21949 0
    150 22019 21949 0
    150 22019 21949 0
    150 22019 21949 0
    150 22057 21949 0
    150 22057 21949 0
    150 22057 21949 0
    150 22057 21949 0
    150 22061 21949 0
    150 22061 21949 0
    150 22061 21949 0
    153 21921 21949 0
    153 21921 21949 0
    153 21936 21949 0
    153 21957 21949 1
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21957 21949 0
    153 21958 21949 0
    153 21958 21949 0
    153 21958 21949 0
    153 21962 21949 0
    153 21963 21949 0
    153 21963 21949 0
    153 21963 21949 0
    153 21966 21949 0
    153 21966 21949 0
    153 21971 21949 0
    153 21971 21949 0
    153 21971 21949 0
    153 21971 21949 0
    153 21971 21949 0
    153 21971 21949 0
    153 21973 21949 0
    153 21973 21949 0
    153 21973 21949 0
    153 21976 21949 0
    153 21976 21949 0
    153 21976 21949 0
    end

  • #2
    Code:
    format Rdate Edate %td
    
    //    VERIFY NECESSARY ASSUMPTIONS
    by firm(Edate), sort: assert Edate[1] == Edate[_N]
    by firm: egen check = total(first_reply)
    assert check == 1 & inlist(first_reply, 0, 1)
    drop check
    
    
    //    CALCULATE COUNT OF REPLIES THAT ARE WITHIN 2 TO 7 DAYS
    //    FOLLOWING FIRST REPLY DATE
    by firm: egen first_reply_date = max(cond(first_reply, Rdate, .))
    format first_reply_date %td
    by firm: egen n_in_window_replies ///
        = total(inrange(Rdate-first_reply_date, 2, 7))
    
    //    CREATE AN INDICATOR VARIABLE
    gen byte no_other_reply_2_7 = n_in_window_replies == 0

    For this code to produce correct results, certain assumptions about the data must be met. There must be only one, constant value of Edate for each firm. There must be only one observation designated as the first reply. And the first_reply variable must be a 0/1 variable with no missing values or other numeric values. All of these assumptions are verified at the start of the code.
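
    If one of those checks fails on real data, it may help to see which firms are responsible. The following is only a sketch on made-up toy data (not the example data from #1), in which firm 150 is deliberately given two observations flagged as first_reply. The scalar name assert_rc is hypothetical.

    ```stata
    * Toy data, invented for illustration: firm 150 has two first_reply flags
    clear
    input double firm float(Rdate Edate first_reply)
    150 21959 21949 1
    150 21976 21949 1
    153 21957 21949 1
    153 21962 21949 0
    end
    format Rdate Edate %td

    * Same check as in the code above: exactly one first reply per firm
    bysort firm: egen check = total(first_reply)

    * capture suppresses the error so the do-file keeps running;
    * assert sets return code 9 when the assertion is false
    capture assert check == 1 & inlist(first_reply, 0, 1)
    scalar assert_rc = _rc
    display assert_rc

    * List the observations of the firms that violate the assumption
    list firm Rdate first_reply check if check != 1, noobs
    ```

    Here the assertion fails and the listing shows both observations of firm 150, which would then need to be corrected before the main code is run.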


    • #3
      Hi Clyde, thanks for your help, the code works perfectly.
