Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How do i create twice weekly data from daily data?

    Hi,

    I have daily level production data and the date each item was produced.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int production_date float(pdt_day week year month)
    22204       .375 42 2020 10
    22205        .28 42 2020 10
    22207       .575 42 2020 10
    22208     .65625 42 2020 10
    22209        .65 43 2020 10
    22210   .6333333 43 2020 10
    22211        .69 43 2020 10
    22212         .4 43 2020 10
    22215     .66875 43 2020 10
    22216      .4125 44 2020 10
    22217       .175 44 2020 10
    22218       .475 44 2020 10
    22219         .5 44 2020 10
    22221         .2 44 2020 11
    22222         .1 44 2020 11
    22223  .27522936 45 2020 11
    22224   .5504587 45 2020 11
    22225   .5504587 45 2020 11
    22226   .6422018 45 2020 11
    22228    .412844 45 2020 11
    22229   .4770642 45 2020 11
    22230   .9174312 46 2020 11
    22231   .9174312 46 2020 11
    22232   .9174312 46 2020 11
    22236   .9174312 46 2020 11
    22237   .6422018 47 2020 11
    22238   .5963303 47 2020 11
    22239   .5504587 47 2020 11
    22240   .7033639 47 2020 11
    22242   .3669725 47 2020 11
    22243  .27522936 47 2020 11
    22244   .3669725 48 2020 11
    22245   .5412844 48 2020 11
    22246   .4931193 48 2020 11
    22247   .4587156 48 2020 11
    22249   .6020642 48 2020 11
    22250   .5504587 48 2020 12
    22251   .5810397 49 2020 12
    22252   .5275229 49 2020 12
    22253   .5412844 49 2020 12
    22254   .6422018 49 2020 12
    22256  .42201835 49 2020 12
    22257  .45298165 49 2020 12
    22258  .53899086 50 2020 12
    22259   .4587156 50 2020 12
    22260   .4587156 50 2020 12
    22261   .6788991 50 2020 12
    22263   .5963303 50 2020 12
    22264   .4587156 50 2020 12
    22265   .5963303 51 2020 12
    22266   .4587156 51 2020 12
    22267   .7110091 51 2020 12
    22268    .733945 51 2020 12
    22270   .5810397 51 2020 12
    22271   .3784404 51 2020 12
    22272   .3944954 52 2020 12
    22273   .4587156 52 2020 12
    22274   .4587156 52 2020 12
    22275   .5504587 52 2020 12
    22279   .5504587 52 2020 12
    22280   .4587156 52 2020 12
    22281   .4587156  1 2021  1
    22282   .5504587  1 2021  1
    22284  .27522936  1 2021  1
    22285   .3669725  1 2021  1
    22286  .27522936  1 2021  1
    22287   .4587156  1 2021  1
    22288   .6116208  2 2021  1
    22289   .5045872  2 2021  1
    22291   .5045872  2 2021  1
    22292   .3669725  2 2021  1
    22293  .38990825  2 2021  1
    22295   .6422018  3 2021  1
    22296   .4587156  3 2021  1
    22298   .5504587  3 2021  1
    22299    .619266  3 2021  1
    22301   .5504587  3 2021  1
    22302   .5810397  4 2021  1
    22303   .6422018  4 2021  1
    22305   .4587156  4 2021  1
    22547 .008354219 39 2021  9
    22548  .07309941 39 2021  9
    22223         .3 45 2020 11
    22224         .6 45 2020 11
    22225         .6 45 2020 11
    22226         .7 45 2020 11
    22228        .45 45 2020 11
    22229        .52 45 2020 11
    22230          1 46 2020 11
    22231          1 46 2020 11
    22232          1 46 2020 11
    22236          1 46 2020 11
    22237         .7 47 2020 11
    22238        .65 47 2020 11
    22239         .6 47 2020 11
    22240   .7666667 47 2020 11
    22242         .4 47 2020 11
    22243         .3 47 2020 11
    22244         .4 48 2020 11
    22245        .59 48 2020 11
    end
    format %td production_date

    To further aggregate this data into weekly and monthly (from daily), I created monthly and weekly data variables

    Code:
     g week = week(production_date)
     g month = month(production_date)
    But I don't know how I can aggregate this to twice weekly...can someone help me with this?

  • #2
    Hi Brendan, can you clarify what you mean by "twice weekly"? For example, you want a variable that indicates which observations fall between certain days of the week?

    Comment


    • #3
      In your previous thread https://www.statalist.org/forums/for...ta-into-weekly you received and then decided to ignore advice that Stata function week() is at best an awkward device. Your project, your choice.

      Now in wanting "twice weekly" the question can be reversed, what is your definition of twice weekly, as a week contains 7 days and quite what makes sense for your project needs to be explicit?

      In your example data Sunday does not occur, so one aggregation might be Monday-Wednesday and Thursday-Saturday.

      Code:
      . gen dow = dow(production_date)
      
      . 
      . tab dow
      
              dow |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                1 |         14       14.00       14.00
                2 |         16       16.00       30.00
                3 |         17       17.00       47.00
                4 |         17       17.00       64.00
                5 |         19       19.00       83.00
                6 |         17       17.00      100.00
      ------------+-----------------------------------
            Total |        100      100.00
      
      gen twiceweekly = dow > 3
      Note that this bears no relation whatsoever to what week() does, as part weeks defined this way can cross Stata weeks and indeed calendar years too.

      However, in the last post of the previous thread you indicated that you've already looked somehow at biweekly effects, so what did you do then and what's the difference in this thread?

      Comment


      • #4
        Hi Nick!

        I did take your advice on the week() function and found that technically, given the dates I have, it would be fine to retain. But that did help me for another project of mine where I realized quickly that aggregating as week() would not be helpful.

        So in the past, I just divided the week variable/2 which makes no sense to me at all. So basically, I didn't really have an answer from what I did.

        For the aggregation, yes, I was thinking every Monday and Wednesday! So now, instead of daily data, data split between every monday and wed of a week is reasonable enough.

        Comment


        • #5
          So now, instead of daily data, data split between every monday and wed of a week is reasonable enough.
          Brendan, the language you are using is still ambiguous. There are several ways to split a week by Monday and Wednesday, depending on whether or not the split day is inclusive or exclusive. Frankly (having read through the other thread) it does not look to your reader like you are being very thoughtful, either with your posts and or with your code. Along with providing a data example, demonstrating that you are thoughtful, and that you are doing your due diligence yourself is one way to get good answers. As on any forum, you should lurk a bit, read other posts and threads, and try to get a sense of the culture. It might benefit you to take a little more time with your writing.

          Fortunately Nick Cox has already provided you with everything you need for a solution. I will post one possible implementation for any future readers of this thread.

          Code:
          gen dow = dow(production_date)
          count if dow == 7
          assert r(N) == 0
          gen wanted = .
          replace wanted = 1 if inlist(dow, 1, 2, 3)
          replace wanted = 2 if inlist(dow, 4, 5, 6)
          Of course, you could do this in fewer lines with recode, but I prefer not to overwrite dow.

          Comment

          Working...
          X