Hi,
I've been trying for hours to merge two data sets, without success. Below is a snapshot of both.
I've tried merge 1:1 id ymonth.
In the real data set I have more observations per individual, one for each month between January 2011 and February 2018, but because of the work history there are some "outliers". I also have other data sets, likewise containing monthly information, that will be merged with these two.
This is data1
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str28 diagnosis str24 municip_code float(ymonth born_year) long id float(benefit_startdate benefit_stopdate male)
"disease in digestive system" "36" 634 1981 1 634 648 0
"disease in digestive system" "36" 638 1981 1   .   . 0
"disease in digestive system" "36" 641 1981 1   .   . 0
"disease in digestive system" "36" 642 1981 1   .   . 0
"disease in digestive system" "36" 645 1981 1   .   . 0
"disease in digestive system" "36" 646 1981 1   .   . 0
"disease in digestive system" "36" 648 1981 1   .   . 0
"disease in digestive system" "36" 679 1981 1   .   . 0
"unknown"                     "22" 614 1969 2   .   . 0
"unknown"                     "22" 620 1969 2   .   . 0
"unknown"                     "22" 661 1969 2 661 687 0
"unknown"                     "22" 666 1969 2   .   . 0
"unknown"                     "22" 687 1969 2   .   . 0
"Mental disorders"            "83" 598 1988 3 598 648 0
"Mental disorders"            "83" 659 1988 3 659 662 0
"Mental disorders"            "83" 672 1988 3 672 689 0
"Mental disorders"            "83" 689 1988 3   .   . 0
"Mental disorders"            "83" 693 1988 3 693 693 0
"MSDs"                        "42" 596 1969 4 596 648 1
"MSDs"                        "42" 644 1969 4   .   . 1
"disease in digestive system" "11" 526 1969 5 526 567 0
"disease in digestive system" "11" 577 1969 5 577 648 0
"unknown"                     "55" 624 1984 6 624 654 0
"unknown"                     "55" 627 1984 6   .   . 0
"unknown"                     "55" 635 1984 6   .   . 0
"unknown"                     "55" 667 1984 6 667 687 0
end
format %tm ymonth
format %ty born_year
format %tm benefit_startdate
format %tm benefit_stopdate
And I'm trying to merge it with this data set.
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id ymonth startwork stopwork occupation_code industry worktime)
1 579 579   . 3310  75 39.73
1 594 594   . 3310  85 39.73
1 594 594 633 3310  85 32.23
1 638 638 640 3310  85 7
1 638 638   . 3310  85 13.23
1 641 641   . 3310  85 2.23
1 642 642   . 3310  85 7.77
1 645 645 659 3310  85 32.23
1 645 645   . 3310  85 12.61
1 648 648 678 3310  85 32.23
1 648 648   . 3310  85 32.23
1 679 679   . 3310  85 32.23
2 314 314   . 3491  22 39.73
2 314 314 612 3491  58 9.73
2 314 314   . 3492  22 9.73
2 614 614 663 3492  58 9.73
2 614 614   . 3492  58 9.73
2 620 620 659 3492  58 9.73
2 620 620   . 3492  58 9.73
2 666 666   . 1310  82 39.73
2 687 687   . 3492 349 9.73
3 689 689   . 3423 342 74.170006
4 547 547   . 7237 727 39.73
4 644 644   . 7231 723 39.73
4 644 644   . 7232 721 39.73
5   .   .   .    .   . .
6 570 570   . 9132  91 39.73
6 580 580   . 5139 512 11.46
6 605 605   . 5140 515 8.8
6 611 611   . 5140 515 14.23
6 627 627   . 5140 515 11.74
6 635 635   . 5140 515 11.35
end
format %tm ymonth
format %tm startwork
format %tm stopwork
I think the problem is that for some months I have more than one observation per individual. This is mainly because working hours changed during the same work period. I would like to create new variables: a dummy for whether there has been a change in working hours, and one for the "new" number of working hours. But I'm not sure how to do this.
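To make the suspected problem concrete, here is a rough sketch of how the duplicates could be checked and a change dummy built. It is only a sketch under assumptions: the second data set is in memory, the first has been saved as data1.dta (a hypothetical file name), and id ymonth uniquely identifies rows in data1. The variable names n_rows, hours_low, hours_high and hours_changed are made up for illustration. Which of the two hours values is the "new" one is not decided here.

```stata
* Sketch only: assumes the second (work history) data set is in memory

* How many id-ymonth combinations appear more than once?
duplicates report id ymonth

* Number of rows per person-month
bysort id ymonth: gen n_rows = _N

* Lowest and highest working hours within each person-month
* (egen min() and max() ignore missing values)
egen hours_low  = min(worktime), by(id ymonth)
egen hours_high = max(worktime), by(id ymonth)

* Dummy: 1 if hours differ across the rows of a person-month, 0 otherwise
gen byte hours_changed = hours_low != hours_high

* Many work rows per person-month against one row in data1.
* ASSUMPTION: data1.dta is a hypothetical file name, and id ymonth
* is unique in data1
merge m:1 id ymonth using data1
```

In the snapshot, id 1 in month 648 has two rows with identical hours (32.23), so a dummy built this way would be 0 there even though the month is duplicated.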
Can someone please guide me?
Regards,
Saliha