Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Identify weekdays for the high and low points in time series analysis

    Dear all,

    I have a question about how to identify the weekday of the high and low points in time series analysis.

    My data looks like this, where odate is the date and daily average is the variable of interest. I generated the weekday of the data by use of the dow command in stata.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(odate daily_average dow)
    21833  558.8615 5
    21834 602.16187 6
    21835  599.1369 0
    21836 586.83997 1
    21837   487.384 2
    21840 588.78766 5
    21841 583.47736 6
    21842  598.8483 0
    21843  569.3498 1
    21847  546.3105 5
    21848 546.90594 6
    21849  592.6006 0
    21853  702.3023 4
    21854   570.108 5
    21855  587.7522 6
    21856 579.67017 0
    21857 576.75354 1
    21858  504.4952 2
    21859  564.3304 3
    21860  722.5038 4
    21861 541.64734 5
    21862 538.60876 6
    21864  630.0263 1
    21865 496.88385 2
    21866  659.5681 3
    21867  718.9317 4
    21868  597.2248 5
    21869  617.5574 6
    21870  608.1977 0
    21871  573.3346 1
    21872  513.1751 2
    21873  579.5194 3
    21874  731.8616 4
    21875  591.5007 5
    21876 585.56494 6
    21877 600.34674 0
    21878  571.5742 1
    21879  478.3791 2
    21880  570.5222 3
    21881  719.6547 4
    21882 573.55774 5
    21883  559.4268 6
    21884  571.9596 0
    21885  609.8727 1
    21886    514.05 2
    21887  573.0786 3
    21888  727.6231 4
    21889  568.8781 5
    21890  598.8592 6
    21891 604.44745 0
    21892  589.0873 1
    21893  484.4622 2
    21894  593.1388 3
    21895  750.2917 4
    21896 597.18036 5
    21897  594.0722 6
    21898  612.2352 0
    21901 593.78156 3
    21902  731.9205 4
    21903  587.6961 5
    21904  585.5294 6
    21905  592.0468 0
    21906  597.9922 1
    21907  457.2679 2
    21908 586.17267 3
    21909  754.5541 4
    21910  602.9098 5
    21911  594.4632 6
    21912  596.5747 0
    21913 606.11554 1
    21914 540.07556 2
    21915  611.2679 3
    21916  716.9801 4
    21917  571.1439 5
    21918 558.21387 6
    21919  588.6963 0
    21920  577.4242 1
    21921  496.9474 2
    21922 558.11566 3
    21923  750.2371 4
    21924  562.8652 5
    21925  594.6396 6
    21926 604.78296 0
    21927  596.4515 1
    21929  606.9549 3
    21930  738.0359 4
    21931 600.69617 5
    21932  570.9493 6
    21933 561.29755 0
    21934 554.81696 1
    21935  493.2901 2
    21936 518.47144 3
    21937  639.7033 4
    21938 559.89703 5
    21939  595.8712 6
    21942   605.671 2
    21943   614.989 3
    21944  704.0444 4
    21945    512.14 5
    21946  415.8226 6
    end
    format %tdCCYY-NN-DD odate
    The code that I used to plot the time series trend is
    Code:
    twoway (tsline daily_average)
    And the result I get is as below:
    Click image for larger version

Name:	Graph.png
Views:	1
Size:	128.3 KB
ID:	1609837


    I want to know the days for the high and low points in the picture, say whether it is Thursday, Sunday etc. I wonder if anyone knows how to do this Stata. Currently what I can do is to mark the weekend on the graph but that's not what we want exactly because it turns out that the weekend does not correspond to the high and low points most of the time. Therefore, I wonder if you have any idea about how to do this in Stata.

    Thank you so much!

  • #2
    Employing the method described here (https://journals.sagepub.com/doi/abs...867X1101000409) to identify weeks starting on Sunday (0) and ending on Saturday (6) per the dow() function:

    Code:
    gen week = odate - mod(dow(odate) - 7, 7)
    format week %tdd_m
    
    bys week: egen high = max(daily_average)
    bys week: egen low = min(daily_average)
    bys week: egen highday = max(cond(daily_average==high,dow,.))
    bys week: egen lowday = max(cond(daily_average==low,dow,.))

    Comment


    • #3
      Originally posted by Ali Atia View Post
      Employing the method described here (https://journals.sagepub.com/doi/abs...867X1101000409) to identify weeks starting on Sunday (0) and ending on Saturday (6) per the dow() function:

      Code:
      gen week = odate - mod(dow(odate) - 7, 7)
      format week %tdd_m
      
      bys week: egen high = max(daily_average)
      bys week: egen low = min(daily_average)
      bys week: egen highday = max(cond(daily_average==high,dow,.))
      bys week: egen lowday = max(cond(daily_average==low,dow,.))
      Hi Ali,

      Thank you so much for your reply. I think it is a very good way to get started. However, I'm still a little bit confused about how to combine the generated high and low with the time series plot that I show above.

      Now I know the weekday when the highest and lowest daily average happens within that week, but I don't know how to combine these with my time series of daily_average and odate and then mark the high and low points on the figure. I wonder if you could elaborate more on this.

      Thank you very much!

      Comment


      • #4
        Code:
        gen week = odate - mod(dow(odate) - 7, 7)
        format week %tdd_m
        label define DOW 0 "Sun" 1 "Mon" 2 "Tue" 3 "Wed" 4 "Thu" 5 "Fri" 6 "Sat"
        
        bys week: egen high = max(daily_average)
        bys week: egen low = min(daily_average)
        bys week: egen highdate = max(cond(daily_average==high,odate,.))
        bys week: egen highday = max(cond(daily_average==high,dow,.))
        bys week: egen lowdate = max(cond(daily_average==low,odate ,.))
        bys week: egen lowday = max(cond(daily_average==low,dow ,.))
        label values *day DOW
        twoway line daily_average odate || scatter high highdate,mlabel(highday) mlabpos(12)||scatter low lowdate,mlabel(lowday) mlabpos(6)||,legend(off) scheme(s1mono)
        Click image for larger version

Name:	Graph.png
Views:	1
Size:	137.6 KB
ID:	1609954

        Last edited by Ali Atia; 16 May 2021, 13:03.

        Comment


        • #5
          Originally posted by Ali Atia View Post
          Code:
          gen week = odate - mod(dow(odate) - 7, 7)
          format week %tdd_m
          label define DOW 0 "Sun" 1 "Mon" 2 "Tue" 3 "Wed" 4 "Thu" 5 "Fri" 6 "Sat"
          
          bys week: egen high = max(daily_average)
          bys week: egen low = min(daily_average)
          bys week: egen highdate = max(cond(daily_average==high,odate,.))
          bys week: egen highday = max(cond(daily_average==high,dow,.))
          bys week: egen lowdate = max(cond(daily_average==low,odate ,.))
          bys week: egen lowday = max(cond(daily_average==low,dow ,.))
          label values *day DOW
          twoway line daily_average odate || scatter high highdate,mlabel(highday) mlabpos(12)||scatter low lowdate,mlabel(lowday) mlabpos(6)||,legend(off) scheme(s1mono)
          [ATTACH=CONFIG]n1609954[/ATTACH]
          Hi Ali,

          The code works perfectly well on my side. Thank you so much for your help and explanation. That means a lot to me!

          Comment


          • #6
            Originally posted by Ali Atia View Post
            Code:
            gen week = odate - mod(dow(odate) - 7, 7)
            format week %tdd_m
            label define DOW 0 "Sun" 1 "Mon" 2 "Tue" 3 "Wed" 4 "Thu" 5 "Fri" 6 "Sat"
            
            bys week: egen high = max(daily_average)
            bys week: egen low = min(daily_average)
            bys week: egen highdate = max(cond(daily_average==high,odate,.))
            bys week: egen highday = max(cond(daily_average==high,dow,.))
            bys week: egen lowdate = max(cond(daily_average==low,odate ,.))
            bys week: egen lowday = max(cond(daily_daily_aggregate==low,dow ,.))
            label values *day DOW
            twoway line daily_average odate || scatter high highdate,mlabel(highday) mlabpos(12)||scatter low lowdate,mlabel(lowday) mlabpos(6)||,legend(off) scheme(s1mono)
            [ATTACH=CONFIG]n1609954[/ATTACH]
            Hi Ali,

            I have an additional question with respect to this time series analysis. Now I have another variable daily_aggregate, and I want to put these two lines together with each one high and low point identified. In addition, I want daily_aggregate to have another y-axis.

            The data is as below:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(odate daily_average dow daily_aggregate)
            21833  558.8615 5 228433.4
            21834 602.16187 6 248830.3
            21835  599.1369 0 286873.3
            21836 586.83997 1 288542.2
            21837   487.384 2 396684.7
            21840 588.78766 5 251097.6
            21841 583.47736 6 264955.8
            21842  598.8483 0 280131.6
            21843  569.3498 1 292125.6
            21847  546.3105 5 322303.1
            21848 546.90594 6 284829.7
            21849  592.6006 0 312704.8
            21853  702.3023 4 565021.4
            21854   570.108 5   281427
            21855  587.7522 6 292463.9
            21856 579.67017 0   285388
            21857 576.75354 1 294538.6
            21858  504.4952 2 354774.8
            21859  564.3304 3 244320.1
            21860  722.5038 4 742229.9
            21861 541.64734 5   240482
            21862 538.60876 6 309879.1
            21864  630.0263 1 341794.4
            21865 496.88385 2 271955.2
            21866  659.5681 3 224037.5
            21867  718.9317 4 551383.1
            21868  597.2248 5 230995.3
            21869  617.5574 6 297261.5
            21870  608.1977 0 337398.2
            21871  573.3346 1 471880.2
            21872  513.1751 2 413158.5
            21873  579.5194 3   296647
            21874  731.8616 4   841382
            21875  591.5007 5   251893
            21876 585.56494 6 283856.1
            21877 600.34674 0 261585.1
            21878  571.5742 1 298434.8
            21879  478.3791 2 462645.1
            21880  570.5222 3 320055.3
            21881  719.6547 4 846669.6
            21882 573.55774 5 321442.8
            21883  559.4268 6 306342.7
            21884  571.9596 0 378978.4
            21885  609.8727 1 330358.8
            21886    514.05 2 410256.6
            21887  573.0786 3 325302.1
            21888  727.6231 4 965301.9
            21889  568.8781 5 316153.5
            21890  598.8592 6 326133.6
            21891 604.44745 0 371609.3
            21892  589.0873 1 425666.1
            21893  484.4622 2 612013.8
            21894  593.1388 3 395843.8
            21895  750.2917 4  1170309
            21896 597.18036 5 382975.6
            21897  594.0722 6 336415.6
            21898  612.2352 0 442558.3
            21901 593.78156 3 356800.1
            21902  731.9205 4 967405.1
            21903  587.6961 5 269775.3
            21904  585.5294 6   265819
            21905  592.0468 0   293197
            21906  597.9922 1 292557.6
            21907  457.2679 2 288545.6
            21908 586.17267 3 211657.5
            21909  754.5541 4 787074.3
            21910  602.9098 5 315600.2
            21911  594.4632 6 251086.5
            21912  596.5747 0 316811.3
            21913 606.11554 1 353601.5
            21914 540.07556 2 453573.3
            21915  611.2679 3   315442
            21916  716.9801 4  1161746
            21917  571.1439 5 383375.1
            21918 558.21387 6 493767.7
            21919  588.6963 0 568176.2
            21920  577.4242 1 610477.4
            21921  496.9474 2 670616.1
            21922 558.11566 3   476779
            21923  750.2371 4  1372432
            21924  562.8652 5 421542.8
            21925  594.6396 6 388884.8
            21926 604.78296 0 429245.3
            21927  596.4515 1 462107.4
            21929  606.9549 3 435527.3
            21930  738.0359 4  1283441
            21931 600.69617 5 434392.3
            21932  570.9493 6 417687.4
            21933 561.29755 0 455406.7
            21934 554.81696 1 578193.9
            21935  493.2901 2 954382.8
            21936 518.47144 3  1323431
            21937  639.7033 4  2410524
            21938 559.89703 5 655217.1
            21939  595.8712 6 507413.9
            21942   605.671 2 775126.8
            21943   614.989 3 909818.9
            21944  704.0444 4  1231788
            21945    512.14 5  1171733
            21946  415.8226 6  1432087
            end
            format %tdCCYY-NN-DD odate

            I used the code below following the previous part you wrote:
            Code:
            bys week: egen high1 = max(daily_aggregate)
            bys week: egen low1 = min(daily_aggregate)
            bys week: egen highdate1 = max(cond(daily_aggregate==high,odate,.))
            bys week: egen highday1 = max(cond(daily_aggregate==high,dow,.))
            bys week: egen lowdate1 = max(cond(daily_aggregate==low,odate ,.))
            bys week: egen lowday1 = max(cond(daily_aggregate==low,dow ,.))
            label values *day DOW
            
            twoway (tsline daily_daily_aggregate)(tsline daily_aggregate, yaxis(2)) || scatter high highdate,mlabel(highday) mlabpos(12)||scatter low lowdate,mlabel(lowday) mlabpos(6)||scatter high1 highdate1,mlabel(highday1) mlabpos(12)||scatter low1 lowdate1,mlabel(lowday1) mlabpos(6)||,leg(on) xtitle("date") ylab(, angle(h))
            However, the result I get is not the one I want. I wonder if you have any idea how to revise the code to get the expected result.

            Thank you very much and look forward to your reply.

            Comment

            Working...
            X