Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting observations into weekly time series data

    Hello,

    I am trying to convert observations into weekly time series data. My dataset includes date and count variables, as seen in the attached file. The data ranges from 1/1/2018 to 12/31/2018. I want to sum the number of counts on a weekly basis. Any help would be appreciated. Thank you.
    Last edited by DY Kim; 30 Jun 2023, 15:50.

  • #2
    You can try:

    Code:
    drop if mi(count)
    desc
    gen edate=date(date, "MDY")
    format edate %d        
    g week = week(edate)
    gen year=year(edate)
    gen tss = yw(year,week)
    format tss %tw
    collapse (sum) count, by(tss)
    See also the Stata tip 111 by Nicholas J. Cox https://journals.sagepub.com/doi/pdf...867X1201200316

    Comment


    • #3
      Please note our advice against attachments at https://www.statalist.org/forums/help#stata 12.2

      There are 3298 entirely empty observations in the dataset. @Miguel Henry's code gets rid of them.

      Thanks for the mention in #2. A fuller list of references is

      Code:
      . search week, sj
      
      Search of official help files, FAQs, Examples, and Stata Journals
      
      SJ-22-2 dm0107_1  . . .  Erratum: Stata tip 145: Numbering weeks within months
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q2/22   SJ 22(2):465--466                                (no commands)
              errata for tip on numbering weeks within months
      
      SJ-22-1 dm0107  . . . . . . . . . Stata tip 145: Numbering weeks within months
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q1/22   SJ 22(1):224--230                                (no commands)
              tip on numbering weeks within months
      
      SJ-19-3 dm0100  . . . . . . . . . .  Speaking Stata: The last day of the month
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q3/19   SJ 19(3):719--728                                (no commands)
              discusses three related problems about getting the last day
              of the month in a new variable
      
      SJ-12-4 dm0065_1  . . . . . Stata tip 111: More on working with weeks, erratum
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q4/12   SJ 12(4):765                                     (no commands)
              lists previously omitted key reference
      
      SJ-12-3 dm0065  . . . . . . . . . .  Stata tip 111: More on working with weeks
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q3/12   SJ 12(3):565--569                                (no commands)
              discusses how to convert data presented in yearly and weekly
              form to daily dates and how to aggregate such data to months
              or longer intervals
      
      SJ-10-4 dm0052  . . . . . . . . . . . . . . . . Stata tip 68: Week assumptions
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q4/10   SJ 10(4):682--685                                (no commands)
              tip on Stata's solution for weeks and on how to set up
              your own alternatives given different definitions of the
              week
      which is less reading than it looks (sorry about the errata). I'd recommend starting with Tip 68.

      @Miguel Henry's solution can be shortened to (in essence, but you will want sensible display formats)


      Code:
      gen ddate = daily(date, "MDY") 
      gen wdate = wofd(ddate)
      but the weeks so described are Stata weeks in which (for the 2018 dates in the dataset) week 52 is 8 days long and stretches from 24 Dec to 31 Dec.

      If you don't mind that, or do want it, then you're good. If you want another definition of week, you have a reading list.


      Comment


      • #4
        Thank you, Miguel and Nick.

        I also deal with data with many zero weekly counts. Would you please advise how to display weeks with zero counts, as well as those with non-zero counts, in the data? The suggested code identified weeks with non-zero counts only.
        Last edited by DY Kim; 01 Jul 2023, 08:05.

        Comment


        • #5
          Data example please

          Comment


          • #6
            Originally posted by Nick Cox View Post
            Data example please

            I converted observations into weekly time series data as follows.

            . gen ddate = daily(date, "MDY")

            . gen wdate = wofd(ddate)

            . format wdate %tw

            . collapse (sum) count, by(wdate)

            . list

            +-----------------+
            | wdate count |
            |-----------------|
            1. | 2015w1 2 |
            2. | 2015w2 1 |
            3. | 2015w3 1 |
            4. | 2015w5 1 |
            5. | 2015w6 1 |
            |-----------------|
            6. | 2015w7 3 |
            7. | 2015w8 1 |
            8. | 2015w9 1 |
            9. | 2015w10 3 |
            10. | 2015w13 1 |
            |-----------------|
            11. | 2015w14 1 |
            12. | 2015w15 1 |
            13. | 2015w18 3 |
            14. | 2015w19 3 |
            15. | 2015w22 2 |
            |-----------------|
            16. | 2015w24 3 |
            17. | 2015w26 1 |
            18. | 2015w28 2 |
            19. | 2015w29 1 |
            20. | 2015w33 1 |
            |-----------------|
            21. | 2015w34 1 |
            22. | 2015w36 2 |
            23. | 2015w38 2 |
            24. | 2015w40 1 |
            25. | 2015w41 1 |
            |-----------------|
            26. | 2015w44 3 |
            27. | 2015w46 1 |
            28. | 2015w48 1 |
            29. | 2015w49 3 |
            30. | 2015w50 2 |
            +-----------------+

            There are only 30 weeks in 2015. The weeks with zero counts did not appear in the data. There is no information on 4th, 11th, 12th, etc. I want to have 4th, 11th, 12th, and other zero count weeks to be shown in the dataset. Thank you for the help

            Comment


            • #7
              If your full dataset spans several years I would do something more like this. Each week is here indexed by the Sunday that starts it. This recipe can be modified to any other start day. See again references in #3.

              Code:
              gen ddate = daily(date, "MDY")
              gen wdate = ddate - dow(ddate) 
              collapse (sum) count, by(wdate) 
              tsset wdate, delta(7)
              tsfill
              Here's proof of concept:

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input str9 date float count
              "5/25/2023"  42
              "7/1/2023" 999
              end
              
              gen ddate = daily(date, "MDY")
              gen wdate = ddate - dow(ddate)
              format ?date %td 
              
              collapse (sum) count, by(wdate)
              
              tsset wdate , delta(7)
              
              tsfill 
              
              replace count = 0 if count == .
              
              list 
              
                   +-------------------+
                   |     wdate   count |
                   |-------------------|
                1. | 21may2023      42 |
                2. | 28may2023       0 |
                3. | 04jun2023       0 |
                4. | 11jun2023       0 |
                5. | 18jun2023       0 |
                   |-------------------|
                6. | 25jun2023     999 |
                   +-------------------+
              .


              Comment


              • #8
                Nick,

                Thank you for the helpful information! As seen below, the first case occurred in the week of 2/25/2018. Thus, the dataset does not show the weeks with zero counts before that week, such as 1/1/2018, 1/8/2018, etc. Your suggestion displayed zero weeks only after the first case. Any advice would be appreciated.

                Code:
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input str10 date byte count
                "2/25/2018" 1
                "3/11/2018" 1
                "3/19/2018" 1
                "3/21/2018" 1
                "3/22/2018" 1
                end
                Last edited by DY Kim; 02 Jul 2023, 10:44.

                Comment


                • #9
                  So, you have a different problem but related problem, wanting to extend the series backward. You need an extra observation to tell Stata where your data should start. Insert an observation

                  Code:
                  1/1/2018  0
                  and apply the same code. My code will tell you that that is in a week starting on 31 December 2017.

                  Comment

                  Working...
                  X