Dear Statalists,
I'm currently working with a dataset that is considered a quasi-panel.
I am interested in individuals, identified by "pid", and in two other variables, "vac" and "wave". "wave" records the time of observation; an individual appears only in the waves in which they chose to take part in the interview.
I would like to identify the individuals who changed their answer to the "vac" variable over time.
For example, individual 1 changed the answer to "vac" in wave 6: in wave 5 the answer was "y" and in wave 6 it was "n" (as in the example data below). I would like to keep those two observations (from wave 5 and wave 6) in the dataset and remove the earlier ones, namely waves 1 to 4.
Individual 2 is a bit more complex, because this individual changed answers more than once. Here I would like to keep only the last change in "vac".
That is, for this individual I would like to keep the observations from wave 5 and wave 6, since the answer was "n" in wave 5 and "m" in wave 6.
If there is a way to generate a new variable that lets me keep only those two observations per individual (the observation where "vac" last changed and the one just before it), I would highly appreciate your help in coding it correctly.
Thank you all very much in advance. I hope the explanation above is clear enough to help find a solution to my problem.
Greetings
Carsten
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input byte pid str1 vac byte wave
 1 "y" 1
 1 "y" 2
 1 "y" 3
 1 "y" 4
 1 "y" 5
 1 "n" 6
 2 "n" 1
 2 "y" 3
 2 "n" 5
 2 "m" 6
 2 "m" 7
 3 "y" 4
 3 "y" 5
 3 "n" 6
 3 "m" 7
 4 "y" 5
 4 "y" 6
 4 "y" 8
 4 "n" 9
 5 "m" 1
 5 "n" 3
 5 "m" 5
 6 "n" 2
 6 "n" 3
 6 "y" 4
 7 "m" 1
 8 "m" 2
 9 "n" 3
10 "y" 4
11 "y" 1
12 "y" 1
13 "y" 1
13 "n" 2
13 "n" 6
13 "y" 7
14 "n" 3
14 "n" 5
14 "y" 6
14 "m" 8
15 "n" 1
16 "m" 2
17 "n" 3
18 "m" 4
19 "n" 6
20 "m" 9
end
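To make the request concrete, here is a minimal sketch of one possible approach, not a definitive solution. It assumes that each pid appears at most once per wave and that "the observation before the change" means the previous interviewed wave, even if waves in between are missing. The variables changed, lastchg and tokeep are hypothetical helper names, and the input is only a subset of the example data (pid 1, 2 and 7) so that the block runs on its own.

```stata
* Subset of the example data (pid 1, 2 and 7), for illustration only
clear
input byte pid str1 vac byte wave
1 "y" 1
1 "y" 2
1 "y" 3
1 "y" 4
1 "y" 5
1 "n" 6
2 "n" 1
2 "y" 3
2 "n" 5
2 "m" 6
2 "m" 7
7 "m" 1
end

* Mark observations whose vac differs from the same person's previous observation
bysort pid (wave): gen byte changed = (_n > 1) & (vac != vac[_n-1])

* Wave of the last change per person (missing if vac never changed)
bysort pid: egen lastchg = max(cond(changed, wave, .))

* Flag the observation at the last change and the one directly before it;
* the _n < _N guard avoids comparing missing with missing for never-changers
bysort pid (wave): gen byte tokeep = (wave == lastchg) | (_n < _N & wave[_n+1] == lastchg)

keep if tokeep
list pid vac wave, sepby(pid)
```

Under these assumptions, pid 1 and pid 2 each keep their wave 5 and wave 6 observations, while pid 7, who never changed the answer, is dropped entirely. If individuals without any change should stay in the data, the final keep condition would need to be adjusted.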