  • The wofd() function returns an inconsistent week id???

    Hello everyone,
    As the title suggests, I've noticed that the week IDs don't line up with the date variable. Specifically:
    • My goal is to create a variable that identifies each week accurately and consistently, starting from a variable that holds the transaction date.
    • From the variable date, formatted as %td, I create a variable w containing the week ID (=0 for 1960w1, =1 for 1960w2, ...) and a variable dow containing the day of the week (=1 for Monday, =6 for Saturday, and =0 for Sunday).
    • At the end of the data, w and dow agree, in the sense that days of the same week share the same value of w (figure 1). But this is no longer true at the beginning of the data (and seemingly in the middle as well), as figure 2 shows.
    I'm not sure whether this is a bug or a change in the convention for which day starts the week. If I want the consistency of figure 1 for the entire sample, how should I do it?
    All comments are appreciated.
    Below is the data along with the code and related figures.
    Code:
    . import excel "Bitcoin ", sheet(Sheet2) first case(lower) clear
    (3 vars, 4,999 obs)
    
    . desc
    
    Contains data
     Observations:         4,999                  
        Variables:             3                  
    -----------------------------------------------------------------------------------------------
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    -----------------------------------------------------------------------------------------------
    date            int     %td..                 date
    price           str8    %9s                   price
    btc             double  %10.0g                btc
    -----------------------------------------------------------------------------------------------
    Sorted by: 
         Note: Dataset has changed since last saved.
    
    . gen w = wofd(date)
    
    . gen dow = dow(date)

    [Figure 1: 1.PNG (end of the data, where w and dow agree)]
    [Figure 2: 2.PNG (beginning of the data, where they do not)]

    Manh Hoang-Ba,
    Facebook,
    Eureka! Uni - YouTube,
    ManhHB94 (Manh Hoang Ba),
    Hoàng Bá Mạnh – Kinh tế lượng: Lý thuyết và ứng dụng

  • #2
    Week numbering in Stata is not tied to days of the week. For more, see pieces found by

    Code:
    search week, sj
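
    A small check, not part of the original reply, may make this concrete. It is only a sketch using the built-in functions mdy(), dow() and wofd(); the dates are arbitrary illustrations.

```stata
// 1 January 2021 was a Friday (dow() == 5), yet it opens week 1 of 2021
display dow(mdy(1, 1, 2021))
display %tw wofd(mdy(1, 1, 2021))
// 7 January is still in week 1; week 2 starts on 8 January, again a Friday
display %tw wofd(mdy(1, 7, 2021))
display %tw wofd(mdy(1, 8, 2021))
```

    So the week ID changes on whatever weekday 1 January happened to fall on in that year, not on a fixed day of the week.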



    • #3
      I think you want something like:
      Code:
      clear
      local start_date = date("2010-07-18", "YMD")
      local end_date = date("2024-03-17", "YMD")
      local num_days = `end_date' - `start_date' + 1   // +1 so that end_date itself is included
      set obs `num_days'
      egen double  date = seq() , from(`start_date') to(`end_date') 
      format date %d
      gen w = wofd(date)
      gen dow = dow(date)
      
      // mark reset at first observation
      gen byte reset = 0
      replace reset = 1 in 1
      
      // mark reset: either dow==0 (Sunday, explicit start) OR dow not greater than the previous value (handles starts mid-cycle)
      replace reset = 1 if _n>1 & (dow==0 | dow <= dow[_n-1])
      
      // cumulative cycle number
      gen long  my_week = sum(reset)
      
      list date w dow  if inlist(w,  2628, 2629 /*, 3337, 3338*/) , sepby(w)
      list date dow  my_week if inlist(my_week, 1,2) , sepby(my_week)
      mycalendar , year(2010) month(7)
      Which returns:
      Code:
      . list date w dow  if inlist(w,  2628, 2629) , sepby(w)
      
            +------------------------+
            |      date      w   dow |
            |------------------------|
         1. | 18jul2010   2628     0 |
         2. | 19jul2010   2628     1 |
         3. | 20jul2010   2628     2 |
         4. | 21jul2010   2628     3 |
         5. | 22jul2010   2628     4 |
            |------------------------|
         6. | 23jul2010   2629     5 |
         7. | 24jul2010   2629     6 |
         8. | 25jul2010   2629     0 |
         9. | 26jul2010   2629     1 |
        10. | 27jul2010   2629     2 |
        11. | 28jul2010   2629     3 |
        12. | 29jul2010   2629     4 |
            +------------------------+
      
      . list date dow  my_week if inlist(my_week, 1,2) , sepby(my_week)
      
            +---------------------------+
            |      date   dow   my_week |
            |---------------------------|
         1. | 18jul2010     0         1 |
         2. | 19jul2010     1         1 |
         3. | 20jul2010     2         1 |
         4. | 21jul2010     3         1 |
         5. | 22jul2010     4         1 |
         6. | 23jul2010     5         1 |
         7. | 24jul2010     6         1 |
            |---------------------------|
         8. | 25jul2010     0         2 |
         9. | 26jul2010     1         2 |
        10. | 27jul2010     2         2 |
        11. | 28jul2010     3         2 |
        12. | 29jul2010     4         2 |
        13. | 30jul2010     5         2 |
        14. | 31jul2010     6         2 |
            +---------------------------+
      
      . mycalendar , year(2010) month(7)
      ------------------------------------
         Calendar for 2010 - 07
      ------------------------------------
      Sun  Mon  Tue  Wed  Thu  Fri  Sat
                           01   02   03   
       04   05   06   07   08   09   10   
       11   12   13   14   15   16   17   
       18   19   20   21   22   23   24   
       25   26   27   28   29   30   31   
      ------------------------------------
      The calendar is printed by the program below, which must be defined before the code above is run:
      Code:
      program define mycalendar, rclass
          version 18.0
          syntax , [Year(integer 1) Month(integer 1)]
      
          // Validate month
          if `month' < 1 | `month' > 12 {
              di as err "month must be 1..12"
              exit 198
          }
          local y = `year'
          local m = `month'
      
          // Weekday of the first day of the month: dow() returns 0 (Sunday) to 6 (Saturday)
          local first_wd = dow(mdy(`m',1,`y'))
          // Days in month (handles leap years): first day of next month minus first day of this month
          local next_m = `m' + 1
          local next_y = `y'
          if `next_m' == 13 {
              local next_m = 1
              local next_y = `y' + 1
          }
          local days = mdy(`next_m',1,`next_y') - mdy(`m',1,`y')
          // Print header
          di as txt "{hline 36}"
          di as txt "   Calendar for `:display %04.0f `y'' - `:display %02.0f `m''"
          di as txt "{hline 36}"
          di as txt "Sun  Mon  Tue  Wed  Thu  Fri  Sat"
          // Indent the first week: one blank 4-character cell per weekday before the 1st
          local wd = `first_wd' // 0..6
          local col = 0
          forvalues i = 1/`wd' {
              display "    " _continue
              local col = `col' + 1
          }
          // Print all days as zero-padded 2-digit numbers in 4-character cells
          forvalues d = 1/`days' {
              local s = string(`d', "%02.0f")
              display " `s' " _continue
              local col = `col' + 1
              if `col' == 7 {
                  // End of week: finish the line and start a new one
                  display ""
                  local col = 0
              }
          }
          // Finish the last line if the month does not end on a Saturday
          if `col' != 0 {
              display ""
          }
          di as txt "{hline 36}"
          // return scalars
          return scalar year = `y'
          return scalar month = `m'
          return scalar days = `days'
      end



      • #4
        Thanks to Nick Cox for the additional information on date-time functions in Stata.
        And thanks to Scott Merryman, that's exactly what I was looking for.


        • #5
          As Nick Cox mentioned, wofd() generates week IDs that are independent of the days of the week.
          More specifically, it treats every year as having exactly 52 weeks: week 1 always starts on January 1st, weeks 1 to 51 have 7 days each, and the remaining days of the year are folded into week 52.
          Therefore, the 52nd week has 8 days in a regular year (e.g., 2021) and 9 days in a leap year (e.g., 2020).

          Edit: The old code was missing December 31st, 2021, so I've revised it to include it.

          Code:
          . clear
          
          . local start_date = date("2020-01-01", "YMD")
          
          . local end_date = date("2022-01-01", "YMD")
          
          . local num_days = `end_date' - `start_date'
          
          . qui set obs `num_days'
          
          . egen double  date = seq() , from(`start_date') to(`end_date') 
          
          . format date %d
          
          . gen w = wofd(date)
          
          . gen dow = dow(date)
          
          . list date w dow if w==3171, sepby(w)
          
               +------------------------+
               |      date      w   dow |
               |------------------------|
          358. | 23dec2020   3171     3 |
          359. | 24dec2020   3171     4 |
          360. | 25dec2020   3171     5 |
          361. | 26dec2020   3171     6 |
          362. | 27dec2020   3171     0 |
          363. | 28dec2020   3171     1 |
          364. | 29dec2020   3171     2 |
          365. | 30dec2020   3171     3 |
          366. | 31dec2020   3171     4 |
               +------------------------+
          
          . list date w dow if w==3223, sepby(w)
          
               +------------------------+
               |      date      w   dow |
               |------------------------|
          724. | 24dec2021   3223     5 |
          725. | 25dec2021   3223     6 |
          726. | 26dec2021   3223     0 |
          727. | 27dec2021   3223     1 |
          728. | 28dec2021   3223     2 |
          729. | 29dec2021   3223     3 |
          730. | 30dec2021   3223     4 |
          731. | 31dec2021   3223     5 |
               +------------------------+
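
          The rule described above can also be written as a formula and checked against wofd(). This is a hedged sketch: the formula is my reading of the behaviour shown in the listings, and w_check is a hypothetical variable name.

```stata
clear
set obs 731
// Daily dates from 1 Jan 2020 to 31 Dec 2021 (one leap year, one regular year)
gen long date = mdy(1, 1, 2020) + _n - 1
format date %td
// Assumed rule: 52 weeks per year counted from 1960, week 1 starts on
// 1 January, and day 358 onward of each year is folded into week 52
gen long w_check = (year(date) - 1960) * 52 + min(floor((doy(date) - 1) / 7), 51)
// Stops with an error if the formula ever disagrees with wofd()
assert w_check == wofd(date)
```

          If the assert passes silently, the formula reproduces wofd() over these two years, including the 9-day and 8-day final weeks.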
          Last edited by Manh Hoang Ba; 06 Apr 2026, 03:12.


          • #6
            In case it helps anyone, here are the references pointed to by #2

            Code:
            . search week, sj
            
            Search of official help files, FAQs, Examples, and Stata Journals
            
            SJ-25-2 dm0116  . . Speaking Stata: Nine notes on dealing with dates and times
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q2/25   SJ 25(2):471--483                                (no commands)
                    overview of several key points in dealing with date and time
                    data in Stata
            
            SJ-22-2 dm0107_1  . . .  Erratum: Stata tip 145: Numbering weeks within months
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q2/22   SJ 22(2):465--466                                (no commands)
                    errata for tip on numbering weeks within months
            
            SJ-22-1 dm0107  . . . . . . . . . Stata tip 145: Numbering weeks within months
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q1/22   SJ 22(1):224--230                                (no commands)
                    tip on numbering weeks within months
            
            SJ-19-3 dm0100  . . . . . . . . . .  Speaking Stata: The last day of the month
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q3/19   SJ 19(3):719--728                                (no commands)
                    discusses three related problems about getting the last day
                    of the month in a new variable
            
            SJ-12-4 dm0065_1  . . . . . Stata tip 111: More on working with weeks, erratum
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q4/12   SJ 12(4):765                                     (no commands)
                    lists previously omitted key reference
            
            SJ-12-3 dm0065  . . . . . . . . . .  Stata tip 111: More on working with weeks
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q3/12   SJ 12(3):565--569                                (no commands)
                    discusses how to convert data presented in yearly and weekly
                    form to daily dates and how to aggregate such data to months
                    or longer intervals
            
            SJ-10-4 dm0052  . . . . . . . . . . . . . . . . Stata tip 68: Week assumptions
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q4/10   SJ 10(4):682--685                                (no commands)
                    tip on Stata's solution for weeks and on how to set up
                    your own alternatives given different definitions of the
                    week
            To that should perhaps be added:

            Code:
            SJ-24-4 st0764  . . . . . . . . . . . Stata tip 158: The devil is in the delta
                    . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                    Q4/24   SJ 24(4):777--783                                (no commands)
                    underlines that the delta() option of tsset or xtset is
                    essential in many cases, even though as the term implies,
                    it is optional as a matter of syntax


            Here is a very personal summary of the situation for Stata users.

            1. Stata's own week numbering is, as Manh Hoang Ba explains, based on week 1 of a (Western calendar) year always starting on 1 January and week 52 always being 8 or 9 days long and ending on 31 December. I've never seen reports of this numbering being used outside Stata, or of it being ideal for anyone's analysis.

            2. Otherwise there may be conventions you need or want to work with, such as epiweeks in epidemiology.

            3. Otherwise choose your own definition based on weeks starting on a particular day of the week or, equivalently, ending on a different particular day. The best of both worlds is often a numbering starting at 1 when your data do, plus a label that is a date string for the beginning or end of the week, whichever carries resonance for the researcher.

            4. But some data are just daily data issued 7 days apart, in which case Stata's week machinery doesn't have to be used at all; declaring the data as such using tsset, delta() or xtset, delta() solves most problems.
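
            As an illustration of point 3, here is a minimal sketch of weeks identified by the daily date on which they start. The variable names wstart, wstart_mon and my_week are hypothetical, and the 28 days of data are made up for the example.

```stata
clear
set obs 28
// Four weeks of daily dates starting on Sunday 18 July 2010
gen long date = mdy(7, 18, 2010) + _n - 1
format date %td
// Sunday that begins the week containing each date (dow() is 0 on Sundays)
gen long wstart = date - dow(date)
format wstart %td
// Monday-start alternative: shift dow() so that Monday maps to 0
gen long wstart_mon = date - mod(dow(date) + 6, 7)
format wstart_mon %td
// Week counter equal to 1 in the first week present in the data
quietly summarize wstart, meanonly
gen long my_week = (wstart - r(min)) / 7 + 1
list date wstart my_week in 1/14, sepby(my_week)
```

            Because wstart is itself a daily date with a %td format, it doubles as the readable label for the week. Once the data are reduced to one observation per week, something like tsset wstart, delta(7) would declare the spacing mentioned in point 4.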



            • #7
              Scott Merryman: In #3 I don't understand how you can use the command -mycalendar-. As far as I can see, your code does not create it, nor does -help mycalendar- or -search mycalendar- produce any useful result.



              • #8
                Sorry, please ignore my post. I did not read all of #3: you explicitly define the program -mycalendar-.
