Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating Time (year) dummy variables for a simple OLS regression

    Hello,

    Correction title: multiple OLS regression

    I'm new to Stata and I am doing research for my master thesis in Economics on sickness absence and I have to check if there is causation/correlation between sickness absence and the independent variables from my dataset.

    I have the following variables:
    • EmployeeID
    • EmployerID
    • Age
    • Sex
    • Sector
    • Startdate Sickness absence
    • Enddate Sickness absence
    • Startdate employment
    • Enddate employment
    • Postcode
    • Cause (Sickness/Pregnancy)
    I cleaned the data by postcode for the region that I want to research
    I cleaned extreme outliers, missing data, faulty data according to the requirements of my population
    I transformed Sex to a dummy variable : Female =1 if female & =0 if male
    I initially had observations for every sickness absence trajectory, I summed up these up per Employee
    I created sector dummy variables for each sector
    I created var Sickness days = end - begin date sickness absence
    I created var Employment days = end - begin date employment
    I created var Sickness percentage = (Sickness days / employment days)*100


    Now i want to create time (Year) dummy variables for my regression to check the trend in sickness absence over time, I have the date periods in td format. What is the best way to create these time dummy variables?

    //regression_command outcome_variable predictors i.year

    The problem is I don't have a Year variable, only the period (beginning & end) date of the sickness absence period. Therefore I would have to generate a Year, but Sickness absence trajectories can span from days to multiple years.

    For example if I had a Year variable, I could use
    //gen Year_2016=1 if year==2016
    //replace Year_2016=0 if year!=2016 & year!=.

    Or
    //
    forval y = 2004/2020 {
    gen year`y' = year == `y'
    }
    //i.year
    //help fvvarlist
    //help xi




    Additional Information (Variable names in Dutch)
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long WerknemerID int(WerkgeverID Leeftijd) float(Vrouw_n Sector_n Typeverzuim_n StartDA EindDA StartDV EindDV Arbeidsdagen VerzuimdagenPWN VerzuimpercentagePWN)
    1199 449 34 1  6 1 18216 19677 19099 19100  1462  61  4.1723666
    1208 639 34 1 16 1 17653 17987 17699 17702   335  52  15.522388
    1209 639 38 1 16 1 18536 18839 18807 18813   304  14   4.605263
    1210 639 66 1 16 1 15096 19175 17391 17713  4080 343   8.406863
    1211 639 65 1 16 1 13221 19175 18689 18709  5955  96  1.6120907
    1212 639 64 1 16 1 12539 19175 17357 17357  6637  18  .27120686
    1213 639 31 1 16 1 18324 18778 18605 18608   455   4   .8791209
    1214 639 60 1 16 1 15779 19175 18668 18671  3397  35  1.0303209
    1215 639 60 0 16 1 11566 19175 18660 18671  7610  24   .3153745
    1217 639 62 1 16 1 17188 19175 17986 17988  1988  20  1.0060362
    1218 639 58 1 16 1 17426 19175 18430 18430  1750  23  1.3142858
    1221 639 64 1 16 1  9497 19175 18233 18684  9679 780   8.058683
    1226 639 61 1 16 1 11759 19175 18135 18137  7417  11  .14830793
    1229 639 62 1 16 1 17776 19175 17947 17961  1400  48  3.4285715
    1230 639 60 1 16 1 14294 19175 17713 17908  4882 225   4.608767
    1231 639 63 1 16 1 17532 18658 17489 17492  1127  13  1.1535048
    1234 639 65 1 16 1 11932 18778 17680 17684  6847 205   2.994012
    1235 639 61 1 16 1 11748 19175 18331 18349  7428  67   .9019924
    1236 639 63 0 16 1 18171 19175 18624 18639  1005  16    1.59204
    1244 639 61 1 16 1 17136 19175 18297 18302  2040 201   9.852942
    1246 639 63 0 16 1 15826 19175 17981 17982  3350  54  1.6119403
    1247 639 64 1 16 1 15200 19175 17546 17547  3976  41  1.0311872
    1249 639 65 1 16 1 17714 19175 18402 18406  1462   5   .3419973
    1250 639 31 1 16 1 18184 18931 18674 18676   748   7   .9358289
    1253 639 31 1 16 1 17790 19175 17245 17346  1386 121   8.730159
    1255 639 31 1 16 1 17714 19175 18389 18391  1462   3  .20519836
    1258 639 31 1 16 1 18625 18778 18920 18923   154  10   6.493506
    1263 639 30 1 16 1 17410 18839 18273 18275  1430  29   2.027972
    1268 639 63 1 16 1 18202 19175 18117 18119   974  14  1.4373716
    1271 639 64 1 16 1 16437 17531 17498 17511  1095  14   1.278539
    1272 639 62 1 16 1 15492 19175 17372 17377  3684 139   3.773073
    1273 639 66 1 16 1 11231 19175 17598 17601  7945 350  4.4052863
    1277 639 59 1 16 1 15356 19175 17532 17537  3820  14  .36649215
    1278 639 66 0 16 1 15229 19175 18567 18570  3947  22   .5573854
    1280 639 60 1 16 1 15188 19175 18010 18010  3988   8   .2006018
    1281 639 58 1 16 1 10638 19175 17436 17436  8538  36   .4216444
    1286 639 59 0 16 1 14610 17531 17492 17500  2922   9   .3080082
    1287 639 66 0 16 1 12753 19175 18266 18275  6423 116  1.8060097
    1288 639 60 1 16 1   -37 17531 17804 17804 17569 114   .6488702
    1289 639 64 1 16 1  6241 19175 18325 18461 12935 144  1.1132586
    1290 639 61 0 16 1 11443 19175 17470 17502  7733 146  1.8880124
    1291 639 61 1 16 1  8095 19175 18364 18367 11081 188  1.6965978
    1292 639 62 0 16 1  7761 19175 17934 17938 11415   5  .04380202
    1299 639 64 1 16 1  9726 19175 18098 18104  9450  69   .7301587
    1300 639 60 1 16 1  6241 19175 18707 18710 12935  22  .17008117
    1301 639 60 1 16 1  9936 19175 18777 18791  9240 471   5.097403
    1302 639 60 1 16 1 15828 18808 18525 18525  2981   9   .3019121
    1303 639 58 1 16 1  9010 18508 18116 18122  9499  32  .33687755
    1305 639 61 0 16 1  -553 17531 17416 17423 18085 116   .6414155
    1307 639 66 1 16 1 10547 18778 17888 17888  8232 424   5.150632
    1308 639 59 1 16 1 16618 19175 18073 18076  2558 816   31.89992
    1310 639 60 0 16 1 15963 19175 18274 18274  3213   1  .03112356
    1311 639 66 0 16 1  9770 19175 18877 18881  9406  19  .20199873
    1312 639 58 1 16 1 16315 19175 17987 17989  2861 198   6.920657
    1314 639 63 1 16 1 15361 19175 18611 18615  3815 250    6.55308
    1315 639 65 1 16 1 14797 19175 17647 17650  4379  57   1.301667
    1317 639 61 1 16 1 17470 19175 18947 19175  1706 252  14.771395
    1318 639 61 0 16 1  8797 19175 17959 17971 10379  18   .1734271
    1319 639 65 0 16 1  9405 19175 18241 18241  9771  11  .11257803
    1321 639 56 1 16 1 12492 19175 18165 18181  6684 633   9.470377
    1323 639 61 1 16 1 14762 19175 17506 17509  4414  13  .29451746
    1324 639 59 1 16 1 12675 19175 18338 18339  6501 581   8.937087
    1325 639 63 1 16 1  7640 19175 18294 18295 11536  36   .3120666
    1326 639 63 0 16 1 10805 19175 18137 18143  8371  75   .8959503
    1327 639 60 0 16 1 17045 18839 17857 17859  1795 132    7.35376
    1330 639 31 0 16 1 18422 18605 18493 18499   184   7   3.804348
    1333 639 31 1 16 1 18506 18870 18773 18776   365  10   2.739726
    1334 639 31 1 16 1 18243 19175 18908 18909   933   6   .6430868
    1335 639 31 1 16 1 18398 19175 18275 18286   778  17    2.18509
    1338 639 30 1 16 1 18506 19175 17832 17833   670  58   8.656716
    1344 639 60 1 16 1 11779 18140 17841 17847  6362  16  .25149325
    1347 639 61 0 16 1 11779 19175 17555 17558  7397   4  .05407597
    1348 639 60 0 16 1  6940 19175 17497 17502 12236  75   .6129454
    1349 639 61 0 16 1 15857 19175 17345 17348  3319   8  .24103646
    1352 639 58 1 16 1 10774 19175 17905 17908  8402  83     .98786
    1365 639 59 0 16 1 14792 19175 18290 18293  4384  27   .6158759
    1366 639 61 1 16 1 11289 19175 17839 17840  7887  23  .29161912
    1367 639 57 1 16 1 18112 19175 18035 18041  1064   7   .6578947
    1368 639 65 1 16 1 12419 19175 17946 17949  6757 254   3.759065
    1369 639 66 1 16 1 12784 18109 17294 17298  5326  21  .39429215
    1370 639 67 0 16 1 -2876 17531 17454 17474 20408  21  .10290083
    1375 639 64 0 16 1  9617 18778 18217 18223  9162 109  1.1896966
    1379 639 66 1 16 1 11443 19175 18665 18674  7733  10  .12931591
    1380 639 62 1 16 1 18171 19175 17914 17924  1005  20  1.9900497
    1383 639 60 0 16 1 17531 17532 17728 17728     2   1         50
    1385 639 63 1 16 1 14823 18017 17384 17386  3195  72  2.2535212
    1392 639 67 1 16 1 16223 19175 17125 17376  2953 252   8.533694
    1395 639 66 1 16 1 15233 19175 18795 18797  3943 184  4.6664977
    1397 639 64 1 16 1  6695 19175 17616 17618 12481 115   .9214005
    1398 639 66 1 16 1 15165 19175 17762 17763  4011  24   .5983545
    1399 639 64 1 16 1  8156 19175 18582 18584 11020 130  1.1796733
    1400 639 30 1 16 1 18506 19175 18536 18548   670  38   5.671642
    1401 639 59 1 16 1 14509 19175 18562 18671  4667 145   3.106921
    1402 639 59 1 16 1    31 17531 17245 17245 17501   7  .03999772
    1403 639 59 0 16 1 13009 17956 17440 17443  4948  60  1.2126112
    1407 639 65 1 16 1 15706 19175 17450 17453  3470 166   4.783862
    1410 639 66 1 16 1 11779 19175 18287 18321  7397  74  1.0004056
    1418 639 64 1 16 1  6940 19175 18219 18220 12236   3 .024517816
    1419 639 65 0 16 1  7532 19175 17276 17279 11644 686   5.891446
    1422 639 57 1 16 1 17391 18121 17755 17755   731   1   .1367989
    end
    format %tdNN/DD/CCYY StartDA
    format %tdNN/DD/CCYY EindDA
    format %tdNN/DD/CCYY StartDV
    format %tdNN/DD/CCYY EindDV
    label values Vrouw_n Vrouw_n_label
    label def Vrouw_n_label 0 "man", modify
    label def Vrouw_n_label 1 "vrouw", modify
    label values Sector_n Sector_n_label
    label def Sector_n_label 6 "Sector G Groot- en detailhandel", modify
    label def Sector_n_label 16 "Sector Q Gezondheids- en welzijnszorg", modify
    label values Typeverzuim_n Typeverzuim_n_label
    label def Typeverzuim_n_label 1 "Ziekte", modify
    Last edited by Roy Claessen; 18 May 2021, 07:03.

  • #2
    perhaps you can create dummies and then use those in the regression?
    Code:
    quietly tab year, gen(yeardum) // it creates the year dummies
    
    * and then to run the regression with dummies:
    
    reg y x yeardum*

    Comment


    • #3
      Roy:
      welcome to this forum.
      Some asides to Tom's helpful guidance:
      - if the right-hand side of your regression equation includes more than one predictors, you're outside the simple regression (oftentimes misinterpreted as) comfort zone, as you have a multiple OLS;
      - your interest in creating -i.time- should not imply that you've panel data. If that were the case, you should consider -xtreg- if your regressand is continuous.
      Kind regards,
      Carlo
      (Stata 18.0 SE)

      Comment


      • #4
        Dear Carlo & Tom. Thank you for responding. Sorry if my post is unclear.

        The problem is that I do not have a panel data format and at the moment, I'm not able to format it the right way. I cannot format it from wide to long since I only have the beginning & end date of employment for each observation.

        For example
        1 observation: EmployeeID...

        Startdate employment
        11/15/2009

        Enddate employment
        11/15/2013

        Employment days (End-Begin date)
        1462

        Similarly, I have calculated the Sickness absence days per employee over these years
        = 61

        Therefore sickness absence rate is simply dividing
        sickness absence days/employment days
        =(61/1462) *100 = 4.17%

        Now I want to create a Year dummy variable for each year and add these to my regression in order to account for the time effects of sickness absence.

        So I would have to generate Year variables based on the employment beginning & end date

        Year 2009 = 1
        Year 2010 = 1
        Year 2011 = 1
        Year 2012 = 1
        Year 2013 = 1
        Other years (ranging from 1979 - 2020) = 0

        Should I create a dummy variable for every year or Is there a better way to do this?

        Comment


        • #5
          Roy:
          if your -id- are different each year, you have cross-sectional data; if your -id- are the same across years, you have panel data.
          Being clearer on this point would make replying easier. Thank.
          Kind regards,
          Carlo
          (Stata 18.0 SE)

          Comment

          Working...
          X