Dear listers
I have a data set with a list of dates, each indicating an event. I want to do several things and think I need to do this stepwise:
1) count new episodes, where, starting from the start date (lowest date by id), a new episode begins when there are more than 200 days between two consecutive dates
2) identify the first episode per id
3) indicate that the first episode is not a returning event but the first one
4) generate a date that indicates the debut date
5) generate a date for each new episode
6) generate a date that indicates the end of an episode
My attempt at each step follows the example data, but after step 6 I get lost.
A mock dataset could look like this; var3 is just for explanation.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(id date) str40 var3
1 19960 "This is the start date of episode 1 id 1"
1 20054 "This is the end date of episode 1 id 1"
1 20453 "This is the start date of episode 2 id 1"
1 20483 ""
1 20512 ""
1 20605 ""
1 20759 ""
1 21125 ""
1 21157 ""
2 18993 ""
2 19025 ""
2 19056 ""
2 19118 ""
2 19524 ""
2 19797 ""
2 20315 ""
2 20393 ""
2 20437 ""
end
format %td date
1) To count new episodes: starting from the start date (lowest date by id), a new episode is defined as more than 200 days between two consecutive dates:
Code:
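* Sort by date within id so that [_n-1] is the previous date.
* On the first row of each id, date[_n-1] is missing, and missing > 200 is
* true in Stata, so that row is flagged as well (this is undone in step 3).
bysort id (date): gen new_episode = 1 if date - date[_n-1] > 200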
2) To identify the first episode per id:
Code:
bysort id (date): gen first_episode = 1 if _n == 1
3) To indicate that the first episode is not a returning event but the first one:
Code:
replace new_episode = . if first_episode == 1
4) To generate a date that indicates the debut date:
Code:
gen debute_date = date if first_episode == 1
format debute_date %td
5) To generate a date for each new episode:
Code:
generate new_episode_date = date if new_episode == 1
format new_episode_date %td
6) To generate a date that indicates the end of an episode:
Code:
bysort id (date): generate end_episode_date = date[_n-1] + 100 if new_episode == 1
format end_episode_date %td
So firstly, there must be a much smarter way of doing this?
And secondly, I really want to know how many days each episode lasts, how many days there are between episodes, etc.
Do I need to reshape, or is it possible to get the same info in long format?
Thank you for reading all the way to the end.
Best,
Lars
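
A possible way to get the episode lengths and the gaps without reshaping is sketched below. It is only a sketch under stated assumptions, not a definitive answer: it assumes the example data above are in memory, and it assumes an episode ends on its last recorded date (as var3 suggests for episode 1 of id 1) rather than 100 days after the previous date. The variable names episode, debut_date, ep_start, ep_end, ep_length, gap_before and first_row are hypothetical and chosen only for illustration.

```stata
* Running episode counter within id: the first row starts episode 1 and
* every gap of more than 200 days to the previous date starts the next one.
bysort id (date): gen episode = sum((_n == 1) | (date - date[_n-1] > 200))

* Debut date: the earliest date for each id, repeated on every row.
bysort id (date): gen debut_date = date[1]

* Start and end date of each episode, repeated on every row of the episode.
bysort id episode (date): gen ep_start = date[1]
bysort id episode (date): gen ep_end   = date[_N]
format debut_date ep_start ep_end %td

* Days from the first to the last date of the episode
* (0 when the episode consists of a single date).
gen ep_length = ep_end - ep_start

* Days between the last date of the previous episode and the first date of
* this one; filled in on the first row of episodes 2, 3, ... only.
bysort id (date): gen gap_before = date - date[_n-1] ///
    if episode != episode[_n-1] & _n > 1

* One line per episode, still in long format.
bysort id episode (date): gen byte first_row = (_n == 1)
list id episode ep_start ep_end ep_length gap_before if first_row, sepby(id) noobs
```

Everything here stays in long format: the by-group subscripts [1], [_N] and [_n-1] carry the per-episode information onto the rows, so under these assumptions no reshape appears to be needed.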
