Hello,
Correction title: multiple OLS regression
I'm new to Stata and I am doing research for my master thesis in Economics on sickness absence and I have to check if there is causation/correlation between sickness absence and the independent variables from my dataset.
I have the following variables:
I cleaned extreme outliers, missing data, faulty data according to the requirements of my population
I transformed Sex to a dummy variable : Female =1 if female & =0 if male
I initially had observations for every sickness absence trajectory, I summed up these up per Employee
I created sector dummy variables for each sector
I created var Sickness days = end - begin date sickness absence
I created var Employment days = end - begin date employment
I created var Sickness percentage = (Sickness days / employment days)*100
Now i want to create time (Year) dummy variables for my regression to check the trend in sickness absence over time, I have the date periods in td format. What is the best way to create these time dummy variables?
//regression_command outcome_variable predictors i.year
The problem is I don't have a Year variable, only the period (beginning & end) date of the sickness absence period. Therefore I would have to generate a Year, but Sickness absence trajectories can span from days to multiple years.
For example if I had a Year variable, I could use
//gen Year_2016=1 if year==2016
//replace Year_2016=0 if year!=2016 & year!=.
Or
//
forval y = 2004/2020 {
gen year`y' = year == `y'
}
//i.year
//help fvvarlist
//help xi
Additional Information (Variable names in Dutch)
Correction title: multiple OLS regression
I'm new to Stata and I am doing research for my master thesis in Economics on sickness absence and I have to check if there is causation/correlation between sickness absence and the independent variables from my dataset.
I have the following variables:
- EmployeeID
- EmployerID
- Age
- Sex
- Sector
- Startdate Sickness absence
- Enddate Sickness absence
- Startdate employment
- Enddate employment
- Postcode
- Cause (Sickness/Pregnancy)
I cleaned extreme outliers, missing data, faulty data according to the requirements of my population
I transformed Sex to a dummy variable : Female =1 if female & =0 if male
I initially had observations for every sickness absence trajectory, I summed up these up per Employee
I created sector dummy variables for each sector
I created var Sickness days = end - begin date sickness absence
I created var Employment days = end - begin date employment
I created var Sickness percentage = (Sickness days / employment days)*100
Now i want to create time (Year) dummy variables for my regression to check the trend in sickness absence over time, I have the date periods in td format. What is the best way to create these time dummy variables?
//regression_command outcome_variable predictors i.year
The problem is I don't have a Year variable, only the period (beginning & end) date of the sickness absence period. Therefore I would have to generate a Year, but Sickness absence trajectories can span from days to multiple years.
For example if I had a Year variable, I could use
//gen Year_2016=1 if year==2016
//replace Year_2016=0 if year!=2016 & year!=.
Or
//
forval y = 2004/2020 {
gen year`y' = year == `y'
}
//i.year
//help fvvarlist
//help xi
Additional Information (Variable names in Dutch)
Code:
* Example generated by -dataex-. For more info, type help dataex clear input long WerknemerID int(WerkgeverID Leeftijd) float(Vrouw_n Sector_n Typeverzuim_n StartDA EindDA StartDV EindDV Arbeidsdagen VerzuimdagenPWN VerzuimpercentagePWN) 1199 449 34 1 6 1 18216 19677 19099 19100 1462 61 4.1723666 1208 639 34 1 16 1 17653 17987 17699 17702 335 52 15.522388 1209 639 38 1 16 1 18536 18839 18807 18813 304 14 4.605263 1210 639 66 1 16 1 15096 19175 17391 17713 4080 343 8.406863 1211 639 65 1 16 1 13221 19175 18689 18709 5955 96 1.6120907 1212 639 64 1 16 1 12539 19175 17357 17357 6637 18 .27120686 1213 639 31 1 16 1 18324 18778 18605 18608 455 4 .8791209 1214 639 60 1 16 1 15779 19175 18668 18671 3397 35 1.0303209 1215 639 60 0 16 1 11566 19175 18660 18671 7610 24 .3153745 1217 639 62 1 16 1 17188 19175 17986 17988 1988 20 1.0060362 1218 639 58 1 16 1 17426 19175 18430 18430 1750 23 1.3142858 1221 639 64 1 16 1 9497 19175 18233 18684 9679 780 8.058683 1226 639 61 1 16 1 11759 19175 18135 18137 7417 11 .14830793 1229 639 62 1 16 1 17776 19175 17947 17961 1400 48 3.4285715 1230 639 60 1 16 1 14294 19175 17713 17908 4882 225 4.608767 1231 639 63 1 16 1 17532 18658 17489 17492 1127 13 1.1535048 1234 639 65 1 16 1 11932 18778 17680 17684 6847 205 2.994012 1235 639 61 1 16 1 11748 19175 18331 18349 7428 67 .9019924 1236 639 63 0 16 1 18171 19175 18624 18639 1005 16 1.59204 1244 639 61 1 16 1 17136 19175 18297 18302 2040 201 9.852942 1246 639 63 0 16 1 15826 19175 17981 17982 3350 54 1.6119403 1247 639 64 1 16 1 15200 19175 17546 17547 3976 41 1.0311872 1249 639 65 1 16 1 17714 19175 18402 18406 1462 5 .3419973 1250 639 31 1 16 1 18184 18931 18674 18676 748 7 .9358289 1253 639 31 1 16 1 17790 19175 17245 17346 1386 121 8.730159 1255 639 31 1 16 1 17714 19175 18389 18391 1462 3 .20519836 1258 639 31 1 16 1 18625 18778 18920 18923 154 10 6.493506 1263 639 30 1 16 1 17410 18839 18273 18275 1430 29 2.027972 1268 639 63 1 16 1 18202 19175 18117 18119 974 14 1.4373716 1271 639 64 1 16 1 16437 17531 17498 17511 1095 14 1.278539 1272 639 62 1 16 1 15492 19175 17372 17377 3684 139 3.773073 1273 639 66 1 16 1 11231 19175 17598 17601 7945 350 4.4052863 1277 639 59 1 16 1 15356 19175 17532 17537 3820 14 .36649215 1278 639 66 0 16 1 15229 19175 18567 18570 3947 22 .5573854 1280 639 60 1 16 1 15188 19175 18010 18010 3988 8 .2006018 1281 639 58 1 16 1 10638 19175 17436 17436 8538 36 .4216444 1286 639 59 0 16 1 14610 17531 17492 17500 2922 9 .3080082 1287 639 66 0 16 1 12753 19175 18266 18275 6423 116 1.8060097 1288 639 60 1 16 1 -37 17531 17804 17804 17569 114 .6488702 1289 639 64 1 16 1 6241 19175 18325 18461 12935 144 1.1132586 1290 639 61 0 16 1 11443 19175 17470 17502 7733 146 1.8880124 1291 639 61 1 16 1 8095 19175 18364 18367 11081 188 1.6965978 1292 639 62 0 16 1 7761 19175 17934 17938 11415 5 .04380202 1299 639 64 1 16 1 9726 19175 18098 18104 9450 69 .7301587 1300 639 60 1 16 1 6241 19175 18707 18710 12935 22 .17008117 1301 639 60 1 16 1 9936 19175 18777 18791 9240 471 5.097403 1302 639 60 1 16 1 15828 18808 18525 18525 2981 9 .3019121 1303 639 58 1 16 1 9010 18508 18116 18122 9499 32 .33687755 1305 639 61 0 16 1 -553 17531 17416 17423 18085 116 .6414155 1307 639 66 1 16 1 10547 18778 17888 17888 8232 424 5.150632 1308 639 59 1 16 1 16618 19175 18073 18076 2558 816 31.89992 1310 639 60 0 16 1 15963 19175 18274 18274 3213 1 .03112356 1311 639 66 0 16 1 9770 19175 18877 18881 9406 19 .20199873 1312 639 58 1 16 1 16315 19175 17987 17989 2861 198 6.920657 1314 639 63 1 16 1 15361 19175 18611 18615 3815 250 6.55308 1315 639 65 1 16 1 14797 19175 17647 17650 4379 57 1.301667 1317 639 61 1 16 1 17470 19175 18947 19175 1706 252 14.771395 1318 639 61 0 16 1 8797 19175 17959 17971 10379 18 .1734271 1319 639 65 0 16 1 9405 19175 18241 18241 9771 11 .11257803 1321 639 56 1 16 1 12492 19175 18165 18181 6684 633 9.470377 1323 639 61 1 16 1 14762 19175 17506 17509 4414 13 .29451746 1324 639 59 1 16 1 12675 19175 18338 18339 6501 581 8.937087 1325 639 63 1 16 1 7640 19175 18294 18295 11536 36 .3120666 1326 639 63 0 16 1 10805 19175 18137 18143 8371 75 .8959503 1327 639 60 0 16 1 17045 18839 17857 17859 1795 132 7.35376 1330 639 31 0 16 1 18422 18605 18493 18499 184 7 3.804348 1333 639 31 1 16 1 18506 18870 18773 18776 365 10 2.739726 1334 639 31 1 16 1 18243 19175 18908 18909 933 6 .6430868 1335 639 31 1 16 1 18398 19175 18275 18286 778 17 2.18509 1338 639 30 1 16 1 18506 19175 17832 17833 670 58 8.656716 1344 639 60 1 16 1 11779 18140 17841 17847 6362 16 .25149325 1347 639 61 0 16 1 11779 19175 17555 17558 7397 4 .05407597 1348 639 60 0 16 1 6940 19175 17497 17502 12236 75 .6129454 1349 639 61 0 16 1 15857 19175 17345 17348 3319 8 .24103646 1352 639 58 1 16 1 10774 19175 17905 17908 8402 83 .98786 1365 639 59 0 16 1 14792 19175 18290 18293 4384 27 .6158759 1366 639 61 1 16 1 11289 19175 17839 17840 7887 23 .29161912 1367 639 57 1 16 1 18112 19175 18035 18041 1064 7 .6578947 1368 639 65 1 16 1 12419 19175 17946 17949 6757 254 3.759065 1369 639 66 1 16 1 12784 18109 17294 17298 5326 21 .39429215 1370 639 67 0 16 1 -2876 17531 17454 17474 20408 21 .10290083 1375 639 64 0 16 1 9617 18778 18217 18223 9162 109 1.1896966 1379 639 66 1 16 1 11443 19175 18665 18674 7733 10 .12931591 1380 639 62 1 16 1 18171 19175 17914 17924 1005 20 1.9900497 1383 639 60 0 16 1 17531 17532 17728 17728 2 1 50 1385 639 63 1 16 1 14823 18017 17384 17386 3195 72 2.2535212 1392 639 67 1 16 1 16223 19175 17125 17376 2953 252 8.533694 1395 639 66 1 16 1 15233 19175 18795 18797 3943 184 4.6664977 1397 639 64 1 16 1 6695 19175 17616 17618 12481 115 .9214005 1398 639 66 1 16 1 15165 19175 17762 17763 4011 24 .5983545 1399 639 64 1 16 1 8156 19175 18582 18584 11020 130 1.1796733 1400 639 30 1 16 1 18506 19175 18536 18548 670 38 5.671642 1401 639 59 1 16 1 14509 19175 18562 18671 4667 145 3.106921 1402 639 59 1 16 1 31 17531 17245 17245 17501 7 .03999772 1403 639 59 0 16 1 13009 17956 17440 17443 4948 60 1.2126112 1407 639 65 1 16 1 15706 19175 17450 17453 3470 166 4.783862 1410 639 66 1 16 1 11779 19175 18287 18321 7397 74 1.0004056 1418 639 64 1 16 1 6940 19175 18219 18220 12236 3 .024517816 1419 639 65 0 16 1 7532 19175 17276 17279 11644 686 5.891446 1422 639 57 1 16 1 17391 18121 17755 17755 731 1 .1367989 end format %tdNN/DD/CCYY StartDA format %tdNN/DD/CCYY EindDA format %tdNN/DD/CCYY StartDV format %tdNN/DD/CCYY EindDV label values Vrouw_n Vrouw_n_label label def Vrouw_n_label 0 "man", modify label def Vrouw_n_label 1 "vrouw", modify label values Sector_n Sector_n_label label def Sector_n_label 6 "Sector G Groot- en detailhandel", modify label def Sector_n_label 16 "Sector Q Gezondheids- en welzijnszorg", modify label values Typeverzuim_n Typeverzuim_n_label label def Typeverzuim_n_label 1 "Ziekte", modify
Comment