Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Data combing

    Hi I'm a beginner to stata. I'd like to run an OLSNO (later on I will try GMM as well as Newey HAC) because overlapping observations will generate a moving average error term. I have daily data from 1989-2015 (~6000+ observations for 8 variables) and I'm hoping to generate variables with only the values of the second to last business day of every month. I'm wondering if there is a faster command than individually copying and pasting points in excel?

  • #2
    I found a solution that worked for me. Luckily I had a date variable:

    gen Date1=date(date,"YMD")
    gen d=day(Date1)
    gen m=month(Date1)
    keep if d==26|d==27|d==28|d==29|d==30

    Unfortunately still requires a bit of manual labor but its worth it.

    Comment


    • #3
      The last business day of the month will sometimes be on the 31st day.

      Here's another take.

      First simple tip: The last day of the month is defined by the fact that the next day is the first of the next month. That's much easier than thinking up code for months variously 28, 29, 30, 31 days long.

      Second: If today is Monday to Friday and tomorrow is the 1st of the next month, we are OK.

      But if the 1st of the next month is Sunday or Monday, we need the previous Friday.

      Here's a sandbox and some code.

      Code:
       
      . clear
      
      . * 2015, but include 1 Jan 2016 as well 
      . set obs 366 
      obs was 0, now 366
      
      . gen date = mdy(12,31,2014) + _n
      
      . format date %td
      
      . gen dow = dow(date) 
      
      . gen select = day(date[_n+1]) == 1 & inrange(dow(date), 1,5)
      
      . l if select
      
           +--------------------------+
           |      date   dow   select |
           |--------------------------|
       90. | 31mar2015     2        1 |
      120. | 30apr2015     4        1 |
      181. | 30jun2015     2        1 |
      212. | 31jul2015     5        1 |
      243. | 31aug2015     1        1 |
           |--------------------------|
      273. | 30sep2015     3        1 |
      334. | 30nov2015     1        1 |
      365. | 31dec2015     4        1 |
           +--------------------------+
      
      . replace select = (day(date[_n+2]) == 1 | day(date[_n+3]) == 1)  & dow(date) == 5 if select == 0 
      (4 real changes made)
      
      . l if select
      
           +--------------------------+
           |      date   dow   select |
           |--------------------------|
       30. | 30jan2015     5        1 |
       58. | 27feb2015     5        1 |
       90. | 31mar2015     2        1 |
      120. | 30apr2015     4        1 |
      149. | 29may2015     5        1 |
           |--------------------------|
      181. | 30jun2015     2        1 |
      212. | 31jul2015     5        1 |
      243. | 31aug2015     1        1 |
      273. | 30sep2015     3        1 |
      303. | 30oct2015     5        1 |
           |--------------------------|
      334. | 30nov2015     1        1 |
      365. | 31dec2015     4        1 |
           +--------------------------+
      We don't need to calculate a day of the week variable; it's just there to check on what was done.

      I don't know anything about Stata's business calendars beyond their existence!
      Last edited by Nick Cox; 25 Feb 2015, 11:56.

      Comment

      Working...
      X