Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting daily data into weekly

    Hi,

    I have daily production data from each employee in a factory and the production date that corresponds to it.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(pdt_day pdt_day_w target_pieces pieces_produced) str3 factory long empcode int production_date
          .375      .375  400 150 "B11" 20076668 22204
           .28       .28  500 140 "B11" 20076668 22205
          .575      .575  800 460 "B11" 20076668 22207
        .65625    .65625  800 525 "B11" 20076668 22208
           .65       .65  800 520 "B11" 20076668 22209
      .6333333  .6333333  600 380 "B11" 20076668 22210
           .69       .69  500 345 "B11" 20076668 22211
            .4        .4  100  40 "B11" 20076668 22212
        .66875    .66875  800 535 "B11" 20076668 22215
         .4125     .4125  800 330 "B11" 20076668 22216
          .175      .175  400  70 "B11" 20076668 22217
          .475      .475  400 190 "B11" 20076668 22218
            .5        .5  100  50 "B11" 20076668 22219
            .2        .2  100  20 "B11" 20076668 22221
            .1        .1  100  10 "B11" 20076668 22222
     .27522936 .27522936  109  30 "B11" 20076668 22223
      .5504587  .5504587  218 120 "B11" 20076668 22224
      .5504587  .5504587  109  60 "B11" 20076668 22225
      .6422018  .6422018  218 140 "B11" 20076668 22226
       .412844   .412844  218  90 "B11" 20076668 22228
      .4770642  .4770642  545 260 "B11" 20076668 22229
      .9174312  .9174312  109 100 "B11" 20076668 22230
      .9174312  .9174312  109 100 "B11" 20076668 22231
      .9174312  .9174312  109 100 "B11" 20076668 22232
      .9174312  .9174312  109 100 "B11" 20076668 22236
      .6422018  .6422018  218 140 "B11" 20076668 22237
      .5963303  .5963303  436 260 "B11" 20076668 22238
      .5504587  .5504587  545 300 "B11" 20076668 22239
      .7033639  .7033639  327 230 "B11" 20076668 22240
      .3669725  .3669725  436 160 "B11" 20076668 22242
     .27522936 .27522936  436 120 "B11" 20076668 22243
      .3669725  .3669725  218  80 "B11" 20076668 22244
      .5412844  .5412844  545 295 "B11" 20076668 22245
      .4931193  .4931193  436 215 "B11" 20076668 22246
      .4587156  .4587156  109  50 "B11" 20076668 22247
      .6020642  .6020642  872 525 "B11" 20076668 22249
      .5504587  .5504587  218 120 "B11" 20076668 22250
      .5810397  .5810397  327 190 "B11" 20076668 22251
      .5275229  .5275229  218 115 "B11" 20076668 22252
      .5412844  .5412844  545 295 "B11" 20076668 22253
      .6422018  .6422018  436 280 "B11" 20076668 22254
     .42201835 .42201835  545 230 "B11" 20076668 22256
     .45298165 .45298165  872 395 "B11" 20076668 22257
     .53899086 .53899086  436 235 "B11" 20076668 22258
      .4587156  .4587156  436 200 "B11" 20076668 22259
      .4587156  .4587156  654 300 "B11" 20076668 22260
      .6788991  .6788991  545 370 "B11" 20076668 22261
      .5963303  .5963303  327 195 "B11" 20076668 22263
      .4587156  .4587156  218 100 "B11" 20076668 22264
      .5963303  .5963303  327 195 "B11" 20076668 22265
      .4587156  .4587156  327 150 "B11" 20076668 22266
      .7110091  .7110091  872 620 "B11" 20076668 22267
       .733945   .733945  872 640 "B11" 20076668 22268
      .5810397  .5810397  327 190 "B11" 20076668 22270
      .3784404  .3784404  436 165 "B11" 20076668 22271
      .3944954  .3944954  545 215 "B11" 20076668 22272
      .4587156  .4587156  109  50 "B11" 20076668 22273
      .4587156  .4587156  109  50 "B11" 20076668 22274
      .5504587  .5504587  109  60 "B11" 20076668 22275
      .5504587  .5504587  109  60 "B11" 20076668 22279
      .4587156  .4587156  109  50 "B11" 20076668 22280
      .4587156  .4587156  109  50 "B11" 20076668 22281
      .5504587  .5504587  109  60 "B11" 20076668 22282
     .27522936 .27522936  109  30 "B11" 20076668 22284
      .3669725  .3669725  109  40 "B11" 20076668 22285
     .27522936 .27522936  109  30 "B11" 20076668 22286
      .4587156  .4587156  218 100 "B11" 20076668 22287
      .6116208  .6116208  327 200 "B11" 20076668 22288
      .5045872  .5045872  109  55 "B11" 20076668 22289
      .5045872  .5045872  218 110 "B11" 20076668 22291
      .3669725  .3669725  109  40 "B11" 20076668 22292
     .38990825 .38990825  218  85 "B11" 20076668 22293
      .6422018  .6422018  109  70 "B11" 20076668 22295
      .4587156  .4587156  109  50 "B11" 20076668 22296
      .5504587  .5504587  109  60 "B11" 20076668 22298
       .619266   .619266  436 270 "B11" 20076668 22299
      .5504587  .5504587  109  60 "B11" 20076668 22301
      .5810397  .5810397  327 190 "B11" 20076668 22302
      .6422018  .6422018  109  70 "B11" 20076668 22303
      .4587156  .4587156  218 100 "B11" 20076668 22305
    .008354219 .06666667 1197  10 "B11" 20114027 22547
     .07309941 .07309941 1368 100 "B11" 20114027 22548
            .3        .3  100  30 "B11" 20115755 22223
            .6        .6  200 120 "B11" 20115755 22224
            .6        .6  100  60 "B11" 20115755 22225
            .7        .7  200 140 "B11" 20115755 22226
           .45       .45  200  90 "B11" 20115755 22228
           .52       .52  500 260 "B11" 20115755 22229
             1         1  100 100 "B11" 20115755 22230
             1         1  100 100 "B11" 20115755 22231
             1         1  100 100 "B11" 20115755 22232
             1         1  100 100 "B11" 20115755 22236
            .7        .7  200 140 "B11" 20115755 22237
           .65       .65  400 260 "B11" 20115755 22238
            .6        .6  500 300 "B11" 20115755 22239
      .7666667  .7666667  300 230 "B11" 20115755 22240
            .4        .4  400 160 "B11" 20115755 22242
            .3        .3  400 120 "B11" 20115755 22243
            .4        .4  200  80 "B11" 20115755 22244
           .59       .59  500 295 "B11" 20115755 22245
    end
    format %td production_date
    From this data, I want to generate weekly, monthly and bi-weekly estimates of what I'm seeing to use them in regressions after. How do i do this? One way I'm doing this is:


    Code:
     g week = week(production_date)
     g month = month(production_date)
     g year = year(production_date)
     g biweekly = week/2
    
    collapse pdt_day pdt_day_w target_pieces pieces_produced (first) factory , by(empcode week year month biweekly)
    Is collapsing the variable the best way to convert this data?

  • #2
    Hello Brendan Clark. What kind of regression model do you want to use? I ask, because I wonder if you could estimate a multilevel model (via -mixed-) with all of the data points (i.e., no need to use -collapse- first) with your various temporal variables as explanatory variables. You could then use -margins- to get the means you want, I think. Apologies if I've misunderstood what you want to do.
    --
    Bruce Weaver
    Email: [email protected]
    Version: Stata/MP 19.5 (Windows)

    Comment


    • #3
      Bruce Weaver This is helpful, thanks! I'm running a FE regression model with time and employee Fixed effects! I'm running the reghdfe command for this

      Comment


      • #4
        Note that the week() function does not do what you think it does.

        For each year, week 1 begins on January 1, regardless of what day of the week that is, and week 52 has 8 or 9 days, depending on whether or not it is a leap year.

        Comment


        • #5
          In support of #4, it's elementary -- but also fundamental -- that weeks don't nest within months unless you start numbering on the first day of each month, and then you're saddled with many partial weeks 1, 2 or 3 days long.

          That can be done -- see

          https://journals.sagepub.com/doi/pdf...6867X221106438

          https://journals.sagepub.com/doi/pdf...6867X221083928

          -- but it might be interesting to look at day of the week effects. In your example data, you have every day but Sunday represented.

          Code:
          . gen dow = dow(production_date)
          
          . tab dow
          
                  dow |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    1 |         14       14.00       14.00
                    2 |         16       16.00       30.00
                    3 |         17       17.00       47.00
                    4 |         17       17.00       64.00
                    5 |         19       19.00       83.00
                    6 |         17       17.00      100.00
          ------------+-----------------------------------
                Total |        100      100.00

          Comment


          • #6
            Nick Cox William Lisowski thanks a lot, very helpful suggestions!

            I collapsed my data the way I showed above and then ran a a fixed effects regression (with month, week and biweekly fixed effects) and got pretty much identical results. I wasn't sure if this was because I did not aggregate the data correctly or if it's to be expected given my dataset

            Comment

            Working...
            X