Hello everyone,
I am trying to generate a variable that meets a criteria by date. I would like to see how many number of job matches a person received in the last 14 or 30 days for each observation. The sample data is as follows:
For example, looking at row 5 of the dataset.
The start date tells us when the call is being made, recall_days tells us if we are asking question about 14 or 30 days ago, recall date tells us what the date was 14 or 30 days ago, job_matches tells us how many job matches were sent for that start date and applications_sent is telling us how many jobs did that person apply to.
Taking row 5 here as the example, we have a start date of 31st october 2017 with the recall date telling us that 14 days prior to this start date, the date would have been 17th october 2017. If we look at the rows before this, we find that row number 4 falls within 17th and 31st october as it has a date of 20th october. Which means that for row 5 the total number of matches will be 1 + 1 = 2 that we have sent in total within the last 14 days. The 14 or 30 days recall time period may span 0, 1 or 2 rows
This has to be done for each row by id_num. If nothing falls within the range, then the the value in the row would be the same as it is.
I hope I was clear in explaining. Please feel free to reply if there is clarification required.
I am trying to generate a variable that meets a criteria by date. I would like to see how many number of job matches a person received in the last 14 or 30 days for each observation. The sample data is as follows:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input float row byte id_num int start_date byte recall_days int recall_date byte(job_matches applications_sent) 1 1 21048 14 21034 2 0 2 1 21076 14 21062 1 0 3 1 21105 14 21091 1 0 4 1 21112 14 21098 1 0 5 1 21123 14 21109 1 0 6 1 21298 14 21284 2 0 7 1 21375 14 21361 4 0 8 1 21399 14 21385 1 0 9 1 21413 14 21399 1 0 10 1 21435 14 21421 2 0 11 1 21451 14 21437 1 0 12 1 21515 14 21501 1 0 13 1 21544 14 21530 2 0 14 1 21606 14 21592 3 0 15 1 21621 14 21607 1 0 16 1 21641 14 21627 1 0 17 1 21661 14 21647 7 0 18 1 21679 14 21665 5 0 19 1 21692 14 21678 2 0 20 1 21725 14 21711 3 0 21 1 21739 14 21725 1 0 22 1 21753 14 21739 1 0 23 1 21789 14 21775 1 0 24 1 21844 14 21830 3 0 25 1 21866 14 21852 3 0 26 1 21876 14 21862 4 0 27 1 21889 14 21875 4 0 28 1 21927 14 21913 2 0 29 2 21048 30 21018 1 0 30 2 21102 30 21072 1 0 31 2 21298 30 21268 1 0 32 2 21312 30 21282 1 0 33 2 21375 30 21345 1 0 34 2 21413 30 21383 1 0 35 2 21425 30 21395 1 0 36 2 21435 30 21405 1 0 37 2 21462 30 21432 1 0 38 2 21479 30 21449 1 0 39 2 21530 30 21500 3 0 40 2 21544 30 21514 2 0 41 2 21560 30 21530 1 0 42 2 21566 30 21536 1 0 43 2 21593 30 21563 1 0 44 2 21621 30 21591 3 0 45 2 21632 30 21602 1 0 46 2 21649 30 21619 2 0 47 2 21661 30 21631 1 0 48 2 21679 30 21649 2 0 49 2 21692 30 21662 3 0 50 2 21811 30 21781 2 0 51 2 21844 30 21814 2 0 52 2 21876 30 21846 1 0 53 2 21889 30 21859 2 0 54 2 21907 30 21877 1 0 55 2 21927 30 21897 2 0 56 2 21953 30 21923 1 0 57 2 22167 30 22137 2 0 58 2 22197 30 22167 1 0 59 2 22294 30 22264 1 0 60 2 22330 30 22300 1 0 61 2 22424 30 22394 2 0 62 2 22487 30 22457 1 0 63 2 22518 30 22488 1 0 64 2 22629 30 22599 1 0 65 3 21048 30 21018 8 0 66 3 21076 30 21046 4 1 67 3 21083 30 21053 1 0 68 3 21088 30 21058 2 0 69 3 21094 30 21064 1 0 70 3 21098 30 21068 1 0 71 3 21102 30 21072 1 0 72 3 21104 30 21074 2 0 73 3 21105 30 21075 2 1 74 3 21110 30 21080 1 0 75 3 21112 30 21082 2 0 76 3 21113 30 21083 1 0 77 3 21116 30 21086 1 0 78 3 21118 30 21088 1 0 79 3 21123 30 21093 1 0 80 3 21278 30 21248 2 0 81 3 21298 30 21268 2 0 82 3 21312 30 21282 3 0 83 3 21375 30 21345 21 0 84 3 21399 30 21369 1 0 85 3 21413 30 21383 4 0 86 3 21425 30 21395 3 0 87 3 21435 30 21405 17 0 88 3 21451 30 21421 4 0 89 3 21462 30 21432 5 0 90 3 21469 30 21439 3 0 91 3 21479 30 21449 7 0 92 3 21496 30 21466 11 0 93 3 21503 30 21473 5 0 94 3 21515 30 21485 16 0 95 3 21530 30 21500 30 0 96 3 21544 30 21514 27 0 97 3 21553 30 21523 13 0 98 3 21560 30 21530 9 0 99 3 21566 30 21536 16 0 100 3 21574 30 21544 15 0 end format %td start_date format %td recall_date
For example, looking at row 5 of the dataset.
The start date tells us when the call is being made, recall_days tells us if we are asking question about 14 or 30 days ago, recall date tells us what the date was 14 or 30 days ago, job_matches tells us how many job matches were sent for that start date and applications_sent is telling us how many jobs did that person apply to.
Taking row 5 here as the example, we have a start date of 31st october 2017 with the recall date telling us that 14 days prior to this start date, the date would have been 17th october 2017. If we look at the rows before this, we find that row number 4 falls within 17th and 31st october as it has a date of 20th october. Which means that for row 5 the total number of matches will be 1 + 1 = 2 that we have sent in total within the last 14 days. The 14 or 30 days recall time period may span 0, 1 or 2 rows
This has to be done for each row by id_num. If nothing falls within the range, then the the value in the row would be the same as it is.
I hope I was clear in explaining. Please feel free to reply if there is clarification required.

Comment