Hi,
I have a dataset of individuals completing a series of tasks over time. The columns of interest are user_id, task_id, task_start_date, and task_completion_date.
The panel is set using user_id.
For each observation, I am trying to find the closest task_completion_date within the same panel (user_id) that falls before that observation's task_start_date.
The observations are sorted by user_id and task_start_date.
The closest question I found on Statalist is https://www.statalist.org/forums/for...-date-variable
but in that question, one of the dates is fixed within a panel. In my case, both dates vary.
Any input is highly appreciated. Thank you in advance.
Code:
input long(user_id task_id) float(task_completion_date task_start_date)
1 145600 19145 19117
1 248937 19376 19369
1 285722 19444 19402
1 423274 19596 19472
1 385689 19542 19507
1 524506 19691 19628
1 537442 19701 19642
1 594828 19774 19749
1 841061 19911 19876
2  14951 18639 18545
2  12787 18619 18574
2  15126 18642 18589
2  15647 18648 18646
2  22675 18705 18686
2 477632 19657 19630
2 528701 19694 19642
end
format %td task_completion_date
format %td task_start_date
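To make the target concrete, here is a rough brute-force sketch of the result I am after, run on the example data above. It rests on two assumptions: "prior to" means strictly earlier than task_start_date, and the new variable name prev_completion is only a placeholder. It loops over every observation, so it is meant to illustrate the desired output rather than to be an efficient solution.

Code:
* prev_completion is a hypothetical name for the wanted variable
generate double prev_completion = .
forvalues i = 1/`=_N' {
    // latest completion date for the same user that is strictly
    // earlier than the start date of observation `i'
    quietly summarize task_completion_date ///
        if user_id == user_id[`i'] & task_completion_date < task_start_date[`i'], meanonly
    // leave the value missing when no earlier completion exists
    if r(N) > 0 {
        quietly replace prev_completion = r(max) in `i'
    }
}
format %td prev_completion

Observations with no earlier completion in the same panel, such as the first task of each user, keep a missing value.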