Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Weekly average data

    I need to get a weekly average (mean) of daily census data. The command week(daily_date) generates the number of weeks in a year that does not match the actual weeks. The command dow(daily_date) creates the actual weekdays. I need the daily census data averaged for Sunday-Saturday.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(num_id daily_date dow) int census
    1 21915 3 52
    1 21916 4 51
    1 21917 5 50
    1 21918 6 49
    1 21919 0 49
    1 21920 1 48
    1 21921 2 50
    1 21922 3 50
    1 21923 4 51
    1 21924 5 50
    1 21925 6 51
    1 21926 0 50
    1 21927 1 51
    1 21928 2 52
    1 21929 3 50
    1 21930 4 49
    1 21931 5 53
    1 21932 6 52
    1 21933 0 52
    1 21934 1 53
    1 21935 2 53
    1 21936 3 53
    1 21937 4 53
    1 21938 5 53
    1 21939 6 53
    1 21940 0 53
    1 21941 1 52
    1 21942 2 52
    1 21943 3 51
    1 21944 4 51
    1 21945 5 50
    1 21946 6 49
    1 21947 0 49
    1 21948 1 50
    1 21949 2 51
    1 21950 3 51
    1 21951 4 51
    1 21952 5 51
    1 21953 6 50
    1 21954 0 50
    1 21955 1 52
    1 21956 2 54
    1 21957 3 54
    1 21958 4 54
    1 21959 5 54
    1 21960 6 52
    1 21961 0 52
    1 21962 1 53
    1 21963 2 53
    1 21964 3 51
    1 21965 4 51
    1 21966 5 51
    1 21967 6 50
    1 21968 0 50
    1 21969 1 48
    1 21970 2 49
    1 21971 3 49
    1 21972 4 50
    1 21973 5 51
    1 21974 6 51
    1 21975 0 51
    1 21976 1 53
    1 21977 2 53
    1 21978 3 53
    1 21979 4 53
    1 21980 5 52
    1 21981 6 51
    1 21982 0 51
    1 21983 1 52
    1 21984 2 52
    1 21985 3 51
    1 21986 4 50
    1 21987 5 52
    1 21988 6 51
    1 21989 0 50
    1 21990 1 51
    1 21991 2 50
    1 21992 3 50
    1 21993 4 48
    1 21994 5 49
    1 21995 6 49
    1 21996 0 49
    1 21997 1 50
    1 21998 2 50
    1 21999 3 50
    1 22000 4 50
    1 22001 5 52
    1 22002 6 52
    1 22003 0 52
    1 22004 1 53
    1 22005 2 54
    1 22006 3 54
    1 22007 4 51
    1 22008 5 50
    1 22009 6 49
    1 22010 0 49
    1 22011 1 49
    1 22012 2 49
    1 22013 3 49
    1 22014 4 47
    end
    format %td daily_date
    label values dow dow
    label def dow 0 "Sunday", modify
    label def dow 1 "Monday", modify
    label def dow 2 "Tuesday", modify
    label def dow 3 "Wednesday", modify
    label def dow 4 "Thursday", modify
    label def dow 5 "Friday", modify
    label def dow 6 "Saturday", modify
    Last edited by gani davlatyov; 26 Jun 2023, 11:51.

  • #2
    See https://journals.sagepub.com/doi/ful...6867X221083928 for techniques on defining weeks. Then it is a matter of collapsing the data.

    Code:
    collapse census, by(week)

    Comment


    • #3
      I expect there is a function that allows one to generate the desired week numbers in one line, but this two-line solution does the trick and is fairly transparent (I think):

      Code:
      * Generate week numbers with Sunday as 1st day of week
      generate week = 1 in 1
      replace week = week[_n-1] + (dow==0) if _n > 1
      Q1. What do you want to do for weeks with less than 7 days?

      Q2. Do you want to add the weekly means to the current dataset, or do you want to replace the working dataset with one that has just the weekly means? Andrew's -collapse- will do the latter, if I am not mistaken. To do the former, you could use -bysort- and -egen-.

      Some rough & ready code for the -collapse- method:

      Code:
      . * [1] Use collapse if you don't want to keep the original data
      . collapse (count) days=dow (mean) wkmean1 = census, by(week)
      
      . generate wkmean2 = wkmean1 if days==7
      (2 missing values generated)
      
      . list, clean noobs
      
          week   days   wkmean1    wkmean2  
             1      4      50.5          .  
             2      7   49.8571   49.85714  
             3      7        51         51  
             4      7   52.8571   52.85714  
             5      7   51.1429   51.14286  
             6      7   50.4286   50.42857  
             7      7   52.8571   52.85714  
             8      7   51.5714   51.57143  
             9      7   49.7143   49.71429  
            10      7   52.2857   52.28571  
            11      7   51.2857   51.28571  
            12      7   49.5714   49.57143  
            13      7   50.4286   50.42857  
            14      7   51.8571   51.85714  
            15      5      48.6          .
      And for the -egen- approach:

      Code:
      . * [2] Use -egen- if you want to keep the original data
      . bysort week: egen wkmean1 = mean(census)
      
      . * Exclude weeks with days < 7
      . by week: egen wkmean2 = mean(census) if _N == 7
      (9 missing values generated)
      
      . list in 1/18, sepby(week)
      
           +----------------------------------------------------------------------+
           | num_id   daily_d~e         dow   census   week    wkmean1    wkmean2 |
           |----------------------------------------------------------------------|
        1. |      1   01jan2020   Wednesday       52      1       50.5          . |
        2. |      1   02jan2020    Thursday       51      1       50.5          . |
        3. |      1   03jan2020      Friday       50      1       50.5          . |
        4. |      1   04jan2020    Saturday       49      1       50.5          . |
           |----------------------------------------------------------------------|
        5. |      1   05jan2020      Sunday       49      2   49.85714   49.85714 |
        6. |      1   06jan2020      Monday       48      2   49.85714   49.85714 |
        7. |      1   07jan2020     Tuesday       50      2   49.85714   49.85714 |
        8. |      1   08jan2020   Wednesday       50      2   49.85714   49.85714 |
        9. |      1   09jan2020    Thursday       51      2   49.85714   49.85714 |
       10. |      1   10jan2020      Friday       50      2   49.85714   49.85714 |
       11. |      1   11jan2020    Saturday       51      2   49.85714   49.85714 |
           |----------------------------------------------------------------------|
       12. |      1   12jan2020      Sunday       50      3         51         51 |
       13. |      1   13jan2020      Monday       51      3         51         51 |
       14. |      1   14jan2020     Tuesday       52      3         51         51 |
       15. |      1   15jan2020   Wednesday       50      3         51         51 |
       16. |      1   16jan2020    Thursday       49      3         51         51 |
       17. |      1   17jan2020      Friday       53      3         51         51 |
       18. |      1   18jan2020    Saturday       52      3         51         51 |
           +----------------------------------------------------------------------+
      HTH.
      --
      Bruce Weaver
      Email: [email protected]
      Version: Stata/MP 18.5 (Windows)

      Comment


      • #4

        As your weeks start on Sunday, the simplest technique I know is to index weeks by the Sundays that start them. The 2010 reference in the paper linked by Andrew Musau says more, but the device explains itself. @Bruce Weaver's technique is equivalent.



        Code:
        gen week = daily_date - dow 
        
        egen wanted = mean(census), by(num_id week)
        
        list , sepby(week) 
        
             +------------------------------------------------------------+
             | num_id   daily_d~e         dow   census    week     wanted |
             |------------------------------------------------------------|
          1. |      1   01jan2020   Wednesday       52   21912       50.5 |
          2. |      1   02jan2020    Thursday       51   21912       50.5 |
          3. |      1   03jan2020      Friday       50   21912       50.5 |
          4. |      1   04jan2020    Saturday       49   21912       50.5 |
             |------------------------------------------------------------|
          5. |      1   05jan2020      Sunday       49   21919   49.85714 |
          6. |      1   06jan2020      Monday       48   21919   49.85714 |
          7. |      1   07jan2020     Tuesday       50   21919   49.85714 |
          8. |      1   08jan2020   Wednesday       50   21919   49.85714 |
          9. |      1   09jan2020    Thursday       51   21919   49.85714 |
         10. |      1   10jan2020      Friday       50   21919   49.85714 |
         11. |      1   11jan2020    Saturday       51   21919   49.85714 |
             |------------------------------------------------------------|
         12. |      1   12jan2020      Sunday       50   21926         51 |
         13. |      1   13jan2020      Monday       51   21926         51 |
         14. |      1   14jan2020     Tuesday       52   21926         51 |
         15. |      1   15jan2020   Wednesday       50   21926         51 |
         16. |      1   16jan2020    Thursday       49   21926         51 |
         17. |      1   17jan2020      Friday       53   21926         51 |
         18. |      1   18jan2020    Saturday       52   21926         51 |
             |------------------------------------------------------------|
         19. |      1   19jan2020      Sunday       52   21933   52.85714 |
         20. |      1   20jan2020      Monday       53   21933   52.85714 |
         21. |      1   21jan2020     Tuesday       53   21933   52.85714 |
         22. |      1   22jan2020   Wednesday       53   21933   52.85714 |
         23. |      1   23jan2020    Thursday       53   21933   52.85714 |
         24. |      1   24jan2020      Friday       53   21933   52.85714 |
         25. |      1   25jan2020    Saturday       53   21933   52.85714 |
             |------------------------------------------------------------|
         26. |      1   26jan2020      Sunday       53   21940   51.14286 |
         27. |      1   27jan2020      Monday       52   21940   51.14286 |
         28. |      1   28jan2020     Tuesday       52   21940   51.14286 |
         29. |      1   29jan2020   Wednesday       51   21940   51.14286 |
         30. |      1   30jan2020    Thursday       51   21940   51.14286 |
         31. |      1   31jan2020      Friday       50   21940   51.14286 |
         32. |      1   01feb2020    Saturday       49   21940   51.14286 |
             |------------------------------------------------------------|
         33. |      1   02feb2020      Sunday       49   21947   50.42857 |
         34. |      1   03feb2020      Monday       50   21947   50.42857 |
         35. |      1   04feb2020     Tuesday       51   21947   50.42857 |
         36. |      1   05feb2020   Wednesday       51   21947   50.42857 |
         37. |      1   06feb2020    Thursday       51   21947   50.42857 |
         38. |      1   07feb2020      Friday       51   21947   50.42857 |
         39. |      1   08feb2020    Saturday       50   21947   50.42857 |
             |------------------------------------------------------------|
         40. |      1   09feb2020      Sunday       50   21954   52.85714 |
         41. |      1   10feb2020      Monday       52   21954   52.85714 |
         42. |      1   11feb2020     Tuesday       54   21954   52.85714 |
         43. |      1   12feb2020   Wednesday       54   21954   52.85714 |
         44. |      1   13feb2020    Thursday       54   21954   52.85714 |
         45. |      1   14feb2020      Friday       54   21954   52.85714 |
         46. |      1   15feb2020    Saturday       52   21954   52.85714 |
             |------------------------------------------------------------|
         47. |      1   16feb2020      Sunday       52   21961   51.57143 |
         48. |      1   17feb2020      Monday       53   21961   51.57143 |
         49. |      1   18feb2020     Tuesday       53   21961   51.57143 |
         50. |      1   19feb2020   Wednesday       51   21961   51.57143 |
         51. |      1   20feb2020    Thursday       51   21961   51.57143 |
         52. |      1   21feb2020      Friday       51   21961   51.57143 |
         53. |      1   22feb2020    Saturday       50   21961   51.57143 |
             |------------------------------------------------------------|
         54. |      1   23feb2020      Sunday       50   21968   49.71429 |
         55. |      1   24feb2020      Monday       48   21968   49.71429 |
         56. |      1   25feb2020     Tuesday       49   21968   49.71429 |
         57. |      1   26feb2020   Wednesday       49   21968   49.71429 |
         58. |      1   27feb2020    Thursday       50   21968   49.71429 |
         59. |      1   28feb2020      Friday       51   21968   49.71429 |
         60. |      1   29feb2020    Saturday       51   21968   49.71429 |
             |------------------------------------------------------------|
         61. |      1   01mar2020      Sunday       51   21975   52.28571 |
         62. |      1   02mar2020      Monday       53   21975   52.28571 |
         63. |      1   03mar2020     Tuesday       53   21975   52.28571 |
         64. |      1   04mar2020   Wednesday       53   21975   52.28571 |
         65. |      1   05mar2020    Thursday       53   21975   52.28571 |
         66. |      1   06mar2020      Friday       52   21975   52.28571 |
         67. |      1   07mar2020    Saturday       51   21975   52.28571 |
             |------------------------------------------------------------|
         68. |      1   08mar2020      Sunday       51   21982   51.28571 |
         69. |      1   09mar2020      Monday       52   21982   51.28571 |
         70. |      1   10mar2020     Tuesday       52   21982   51.28571 |
         71. |      1   11mar2020   Wednesday       51   21982   51.28571 |
         72. |      1   12mar2020    Thursday       50   21982   51.28571 |
         73. |      1   13mar2020      Friday       52   21982   51.28571 |
         74. |      1   14mar2020    Saturday       51   21982   51.28571 |
             |------------------------------------------------------------|
         75. |      1   15mar2020      Sunday       50   21989   49.57143 |
         76. |      1   16mar2020      Monday       51   21989   49.57143 |
         77. |      1   17mar2020     Tuesday       50   21989   49.57143 |
         78. |      1   18mar2020   Wednesday       50   21989   49.57143 |
         79. |      1   19mar2020    Thursday       48   21989   49.57143 |
         80. |      1   20mar2020      Friday       49   21989   49.57143 |
         81. |      1   21mar2020    Saturday       49   21989   49.57143 |
             |------------------------------------------------------------|
         82. |      1   22mar2020      Sunday       49   21996   50.42857 |
         83. |      1   23mar2020      Monday       50   21996   50.42857 |
         84. |      1   24mar2020     Tuesday       50   21996   50.42857 |
         85. |      1   25mar2020   Wednesday       50   21996   50.42857 |
         86. |      1   26mar2020    Thursday       50   21996   50.42857 |
         87. |      1   27mar2020      Friday       52   21996   50.42857 |
         88. |      1   28mar2020    Saturday       52   21996   50.42857 |
             |------------------------------------------------------------|
         89. |      1   29mar2020      Sunday       52   22003   51.85714 |
         90. |      1   30mar2020      Monday       53   22003   51.85714 |
         91. |      1   31mar2020     Tuesday       54   22003   51.85714 |
         92. |      1   01apr2020   Wednesday       54   22003   51.85714 |
         93. |      1   02apr2020    Thursday       51   22003   51.85714 |
         94. |      1   03apr2020      Friday       50   22003   51.85714 |
         95. |      1   04apr2020    Saturday       49   22003   51.85714 |
             |------------------------------------------------------------|
         96. |      1   05apr2020      Sunday       49   22010       48.6 |
         97. |      1   06apr2020      Monday       49   22010       48.6 |
         98. |      1   07apr2020     Tuesday       49   22010       48.6 |
         99. |      1   08apr2020   Wednesday       49   22010       48.6 |
        100. |      1   09apr2020    Thursday       47   22010       48.6 |
             +------------------------------------------------------------+

        Comment


        • #5
          Both options (Nick and Bruce) worked. Appreciate the help!

          Comment

          Working...
          X