  • Working with dates (year - week)

    Hello,

    I have a variable in my dataset that expresses the year and week in the following way: year - week. For example: 2020 - 01, 2020-02, 2020-03, and so on. This variable is stored as a string. I would like to transform these values into time values.
    Can you help me with the code?
    Thank you in advance

  • #2
    Wow, that is really difficult. The "week of a year" is not well defined.

    In 2020 January 1 was on Wednesday. What is the first week of 2020?
    • Wednesday January 1 through Tuesday January 7
    • Sunday January 5 through Saturday January 11
    • Sunday December 29 through Saturday January 4
    • Something else
    Without understanding what your data represent, it's difficult to give concrete advice directed to your situation.

    Stata's idea of the week of the year is the first option above, and the final week of 2020 will, to Stata, be the 52nd week with 9 days, since 2020 had 366 days = 52 weeks + 2 days. But I have seen all three options used in different circumstances.
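
    A minimal sketch of that rule, using only the official date functions week(), wofd() and mdy(); the dates are picked purely for illustration:

    ```stata
    * under Stata's rule week 1 always starts on 1 January
    display week(mdy(1, 1, 2020))
    display week(mdy(1, 7, 2020))
    * 8 January is already in week 2
    display week(mdy(1, 8, 2020))
    * 22 December 2020 is the last day of week 51
    display week(mdy(12, 22, 2020))
    * week 52 of 2020 then runs from 23 December to 31 December (9 days)
    display week(mdy(12, 23, 2020))
    display week(mdy(12, 31, 2020))
    * the weekly date holding 31 December 2020 displays as 2020w52
    display %tw wofd(mdy(12, 31, 2020))
    ```

    The displayed week numbers are 1, 1, 2, 51, 52 and 52, so no daily date is ever assigned to a week 53.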





    • #3
      The data are Covid cases and deaths that are reported every week. Surely there is a way to tell Stata that the values are for a week of a year.



      • #4
        Your response does not address my question. Let me ask it a different way.

        In your data, what dates are included in "2020 - 01"? What dates are included in the last week of 2020? What dates are included in "2021 - 01"?



        • #5
          In the dataset I have one variable called "year - week", and each observation is (for example, for the first 3 months of last year) "2020 - 1", "2020-2", "2020-3". Then there are other columns with data for each of these weeks in 2020, for example the number of deaths or cases of Covid-19.



          • #6
            It is only one value per week...



            • #7
              .
              Last edited by Ana Vasconcelos; 23 Jan 2021, 19:05.



              • #8
                Hi, Ana. Maybe this is what you want. Please use the command --numdate-- (SSC).
                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str7 date
                "1991-01"
                "1992-02"
                "2020-01"
                "2020-02"
                "2020-03"
                "2020-04"
                "2021-09"
                "2021-18"
                end
                
                . list
                
                     +---------+
                     |    date |
                     |---------|
                  1. | 1991-01 |
                  2. | 1992-02 |
                  3. | 2020-01 |
                  4. | 2020-02 |
                  5. | 2020-03 |
                     |---------|
                  6. | 2020-04 |
                  7. | 2021-09 |
                  8. | 2021-18 |
                     +---------+
                . numdate w week=date,pattern(YW)
                
                . list
                
                     +-------------------+
                     |    date      week |
                     |-------------------|
                  1. | 1991-01    1991w1 |
                  2. | 1992-02    1992w2 |
                  3. | 2020-01    2020w1 |
                  4. | 2020-02    2020w2 |
                  5. | 2020-03    2020w3 |
                     |-------------------|
                  6. | 2020-04    2020w4 |
                  7. | 2021-09    2021w9 |
                  8. | 2021-18   2021w18 |
                     +-------------------+
                I don't know whether this is what you want.
                Best regards.

                Raymond Zhang
                Stata 17.0,MP



                • #9
                  Short answer: #3 explains that these are Covid cases and so at most span 2020 and 2021. If the data DON'T include any week 53, the solution in #8 will work fine, as will the code below.

                  Long answer: #3 didn't answer @William Lisowski's question, as he pointed out. Neither did #5 or #6.

                  @Raymond Zhang's helpful answer necessarily side-stepped the issue. Raymond did not have the information asked in #2 either.

                  There are three kinds of weeks:

                  1. Stata's definition, which is documented. I have never encountered examples from data providers using this definition. Note that if you know that there are any data for week 53 in any year, then your data are NOT following Stata's definition.

                  2. Any definition of the form "the week ends on XXXday" or equivalently "the week begins on YYYday" or equivalently "the week is centred on ZZZday" (which I can't recall being used in practice but is logically possible if we assume that all weeks are 7 days long). Thus if weeks end on Friday, then they start on Saturday. Under this (kind of) definition all weeks are 7 days long, but there is still a question of how weeks are labelled, given that with this definition some weeks will always start in one year and end in another.

                  3. Any other definition, especially but not only any definitions that allow partial weeks 1 2 3 4 5 or 6 days long.
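
                  For definition 2, a rough sketch, assuming daily dates are available and that weeks start on Sunday. The names sdate, ddate and wstart are illustrative and do not come from the data in this thread:

                  ```stata
                  clear
                  input str10 sdate
                  "2020-01-01"
                  "2020-01-04"
                  "2020-01-05"
                  "2020-12-31"
                  end

                  gen ddate = daily(sdate, "YMD")
                  format ddate %td

                  * dow() returns 0 for Sunday through 6 for Saturday, so
                  * subtracting it steps back to the Sunday starting each week
                  gen wstart = ddate - dow(ddate)
                  format wstart %td

                  list sdate wstart
                  ```

                  Each week is then identified by the daily date of its first day, which sidesteps the question of which year a week straddling 1 January belongs to.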

                  numdate is (here) just a convenience wrapper for Stata's weekly() function and applying a default %tw format -- and it can't cope with any weeks 53, any more than can the direct code below.

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input str7 date
                  "1991-01"
                  "1992-02"
                  "2019-53"
                  "2020-01"
                  "2020-02"
                  "2020-03"
                  "2020-04"
                  "2021-09"
                  "2021-18"
                  end
                  
                  gen wanted = weekly(date, "YW")
                  format wanted %tw
                  
                  list
                  
                       +-------------------+
                       |    date    wanted |
                       |-------------------|
                    1. | 1991-01    1991w1 |
                    2. | 1992-02    1992w2 |
                    3. | 2019-53         . |
                    4. | 2020-01    2020w1 |
                    5. | 2020-02    2020w2 |
                       |-------------------|
                    6. | 2020-03    2020w3 |
                    7. | 2020-04    2020w4 |
                    8. | 2021-09    2021w9 |
                    9. | 2021-18   2021w18 |
                       +-------------------+
                  The discussions below say more --

                  Code:
                  . search week, sj
                  
                  Search of official help files, FAQs, Examples, and Stata Journals
                  
                  SJ-19-3 dm0100  . . . . . . . . . .  Speaking Stata: The last day of the month
                          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                          Q3/19   SJ 19(3):719--728                                (no commands)
                          discusses three related problems about getting the last day
                          of the month in a new variable
                  
                  SJ-12-4 dm0065_1  . . . . . Stata tip 111: More on working with weeks, erratum
                          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                          Q4/12   SJ 12(4):765                                     (no commands)
                          lists previously omitted key reference
                  
                  SJ-12-3 dm0065  . . . . . . . . . .  Stata tip 111: More on working with weeks
                          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                          Q3/12   SJ 12(3):565--569                                (no commands)
                          discusses how to convert data presented in yearly and weekly
                          form to daily dates and how to aggregate such data to months
                          or longer intervals
                  
                  SJ-10-4 dm0052  . . . . . . . . . . . . . . . . Stata tip 68: Week assumptions
                          . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
                          Q4/10   SJ 10(4):682--685                                (no commands)
                          tip on Stata's solution for weeks and on how to set up
                          your own alternatives given different definitions of the
                          week
                  -- but certainly don't exhaust the subject, as for example they say nothing about ISO weeks

                  https://en.wikipedia.org/wiki/ISO_8601#Week_dates

                  https://www.statalist.org/forums/for...ata-code-below

                  or epiweeks.
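
                  For ISO weeks, a minimal sketch that maps a year-week string to the daily date of the Monday starting that week. It assumes the data really do follow ISO 8601 (nothing in this thread confirms that) and that the strings look like "2020-53"; the variable names are illustrative:

                  ```stata
                  clear
                  input str7 date
                  "2020-01"
                  "2020-02"
                  "2020-53"
                  "2021-01"
                  "2021-18"
                  end

                  * numeric year and week from the string (assumed layout YYYY-WW)
                  gen year = real(substr(date, 1, 4))
                  gen week = real(substr(date, 6, 2))

                  * 4 January always falls in ISO week 1; dow() is 0 for Sunday,
                  * so mod(dow() + 6, 7) counts the days elapsed since Monday
                  gen jan4 = mdy(1, 4, year)
                  gen monday = jan4 - mod(dow(jan4) + 6, 7) + 7 * (week - 1)
                  format jan4 monday %td

                  list date monday
                  ```

                  Here 2020-53 maps to 28 Dec 2020 and 2021-01 to 04 Jan 2021, so week 53 is kept and consecutive weeks stay 7 days apart. A daily date spaced like this can be declared with tsset monday, delta(7).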

                  Last edited by Nick Cox; 24 Jan 2021, 02:57.



                  • #10
                    @Ana Vasconcelos You should show your example data and show us what the wanted data is like.
                    Best regards.

                    Raymond Zhang
                    Stata 17.0,MP



                    • #11
                      Thank you very much for your help!



                      • #12
                        @Ana Vasconcelos Have you solved your problem?
                        Best regards.

                        Raymond Zhang
                        Stata 17.0,MP



                        • #13
                          Hello,

                          Yes, I did solve the problem. I used these commands:
                          gen wanted = weekly(date, "YW")
                          format wanted %tw
                          The data for week 53 are not being taken into account.
                          Thank you very much for all your help!



                          • #14
                            If you have any week 53 and are ignoring it, then that's a poor solution. A better one is

                            Code:
                            egen numweek = group(date), label
                            so long as you have data for all weeks since the first week in your dataset.
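
                            One caveat, assuming the week part is not always zero-padded (as in the "2020 - 1" quoted in #5): group() orders a string variable alphabetically, so "2020 - 10" would come before "2020 - 9". A sketch that splits out a numeric year and week first; the names year and week and the sample strings are illustrative:

                            ```stata
                            clear
                            input str9 date
                            "2020 - 9"
                            "2020 - 10"
                            "2020 - 53"
                            "2021 - 1"
                            "2021 - 2"
                            end

                            * year is the first four characters; week is whatever follows the hyphen
                            gen year = real(substr(date, 1, 4))
                            gen week = real(strtrim(substr(date, strpos(date, "-") + 1, .)))

                            * groups are numbered in order of year, then week within year
                            egen numweek = group(year week)

                            sort numweek
                            list date year week numweek
                            ```

                            With this ordering week 53 of 2020 is numbered immediately before week 1 of 2021, still on the condition that every week since the first is present.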



                            • #15
                              Originally posted by Nick Cox
                              If you have any week 53 and are ignoring it, then that's a poor solution. A better one is

                              Code:
                              egen numweek = group(date), label
                              so long as you have data for all weeks since the first week in your dataset.
                              Dear Nick Cox,

                              I have used your tips regarding this issue as I am also dealing with a week 53 in my panel dataset. Unfortunately, when using code:
                              Code:
                               
                               egen numweek = group(date), label
                              as suggested by you, my week number (2020, 53) ends up at the end of the complete period (Week 8, Year 2022) per country.

                              Example:
                              2022, 5
                              2022, 6
                              2022, 7
                              2022, 8
                              2020, 53
                              Would you say this is an issue? Does Stata read the date differently due to a change in order? Or does this not matter when eventually setting my panel and time variable?

                              Excuse me for any mistakes in my message, I am still learning my way around this portal!
