This is what my spell data set looks like (a -dataex- example is posted below):
This is for one individual (2077314). betnr_siab is the company ID; betnr_dau is the status of the spell within that company: the first spell, one of the middle spells, the last spell, or a one-off spell. One-off spells are employment spells where the individual works for the company for only one spell (typically a year in this dataset, but it can be more or less). To create the variable "time_until_last_ue" I used the code posted below, which starts with the comment "Exiting reemployment into another UI".
What I now need to do is a little complicated for me. I need to generate a variable called "duration_of_reemployment" that holds a value for each unemployment insurance spell (erw==11). For each such spell, I need to pick the maximum value that "time_until_last_ue" reaches right before it resets to zero, and place it in "duration_of_reemployment" in the row of the most recent erw==11 spell before that maximum.
For example, in this data snippet the value of "time_until_last_ue" in spell nr. 10 (identified by nspell) is 1461. This is the largest value before the variable resets to 0 in spell nr. 11. So I would like to put 1461 in the row of spell nr. 6 (the preceding erw==11 spell) in the generated variable "duration_of_reemployment".
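A minimal sketch of one way this might be done, assuming the variables are named as in the -dataex- example (persnr_siab, nspell, _seq, time_until_last_ue, erwstat). The helper names ue_block and block_max are hypothetical and only used for illustration.

```stata
* Sketch only: the relevant columns of the -dataex- example
clear
input long persnr_siab float nspell int _seq float time_until_last_ue int erwstat
2077314  1 1000   97 102
2077314  2 1000  463 101
2077314  3 1000  828 101
2077314  4 1000 1027 101
2077314  5 1000 1037 101
2077314  6    0    0  11
2077314  7    1  365 101
2077314  8    2  731 101
2077314  9    3 1096 101
2077314 10    4 1461 101
2077314 11    0    0  12
2077314 12    1  207 101
2077314 13    2  381 101
2077314 14    0    0  11
2077314 15    1   92 109
2077314 16    2  457 101
2077314 17    3  822 102
2077314 18    4 1187 101
2077314 19    5 1553 101
end

* ue_block (hypothetical name): running count of unemployment spells, so an
* unemployment spell and the spells that follow it share one block number
bysort persnr_siab (nspell): gen long ue_block = sum(_seq == 0)

* block_max (hypothetical name): largest cumulative duration within the block
bysort persnr_siab ue_block: egen block_max = max(time_until_last_ue)

* keep the value only on unemployment spells with benefits (erwstat == 11)
gen duration_of_reemployment = block_max if _seq == 0 & erwstat == 11

drop ue_block block_max
sort persnr_siab nspell
list nspell _seq erwstat time_until_last_ue duration_of_reemployment, noobs
```

In this sketch spell nr. 6 gets 1461, spell nr. 11 (erwstat 12) stays missing, and spell nr. 14 gets 1553. Note that spell nr. 14 gets a value even though no later unemployment spell is observed, so that duration is right-censored and may need separate treatment.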
The _seq variable follows this logic: there are 3 kinds of unemployment spells (erw==11, 12 and 15; all have quelle==2), and every unemployment spell gets _seq==0. The spells after an unemployment spell are numbered 1, 2, 3, and so on, and the counter resets to 0 at the next unemployment spell. Spells that come before the individual's first unemployment spell get _seq==1000. Although _seq is 0 for every kind of unemployment spell, I only want to copy the maximum of "time_until_last_ue" into "duration_of_reemployment" for the erw==11 spells (i.e., unemployment spells with benefits). For all other spells the variable can stay missing.
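The _seq logic described above can also be reproduced with by-groups alone, as in the sketch below. It assumes that erwstat 11, 12 and 15 identify exactly the unemployment spells (the same spells as quelle==2); is_ue, ue_block and seq_check are hypothetical names, and the data are spells 4 to 12 of the -dataex- example.

```stata
* Sketch only: spells 4-12 of the -dataex- example
clear
input long persnr_siab float nspell int(_seq erwstat)
2077314  4 1000 101
2077314  5 1000 101
2077314  6    0  11
2077314  7    1 101
2077314  8    2 101
2077314  9    3 101
2077314 10    4 101
2077314 11    0  12
2077314 12    1 101
end

* is_ue (hypothetical name): flags unemployment spells; assumes erwstat
* 11, 12 and 15 mark the same spells as quelle == 2
gen byte is_ue = inlist(erwstat, 11, 12, 15)

* ue_block (hypothetical name): 0 before the first unemployment spell,
* then increases by 1 at every unemployment spell
bysort persnr_siab (nspell): gen long ue_block = sum(is_ue)

* seq_check (hypothetical name): position within the block, counted from 0
bysort persnr_siab ue_block (nspell): gen int seq_check = _n - 1

* spells before the first unemployment spell get the placeholder 1000
replace seq_check = 1000 if ue_block == 0

sort persnr_siab nspell
* the rebuilt counter should agree with the existing _seq
assert seq_check == _seq
```

The final assert is only a consistency check against the existing _seq; on the full data it would stop with an error if the two definitions of an unemployment spell disagree.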
Intuition: "time_until_last_ue" resets to 0 only at unemployment spells (not just erw==11, but also 12 and 15). So this variable cumulatively counts the duration (number of days) between two unemployment spells: it is 0 at the unemployment spell itself (e.g. spell nr. 6, _seq 0), then adds up the reemployment spells from the first one (spell nr. 7) to the last one (spell nr. 10) before the individual exits into unemployment again (spell nr. 11, _seq 0 again). If you have a simpler way of solving this problem, that would also be great.
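A possibly simpler route, sketched under assumptions: the cumulative duration is built directly from the spell dates, and the block total is then read off the last row of each block. It assumes the episode length equals endorig - begorig + 1 (which matches the values in the -dataex- example) and that erwstat 11, 12 and 15 identify the unemployment spells. The names is_ue, ue_block, spell_days, time_check and dur_check are hypothetical; the data are spells 6 to 13 of the example.

```stata
* Sketch only: spells 6-13 of the -dataex- example
clear
input long persnr_siab float(nspell time_until_last_ue) int(erwstat begorig endorig)
2077314  6    0  11 12628 12783
2077314  7  365 101 12784 13148
2077314  8  731 101 13149 13514
2077314  9 1096 101 13515 13879
2077314 10 1461 101 13880 14244
2077314 11    0  12 14245 14402
2077314 12  207 101 14403 14609
2077314 13  381 101 14610 14783
end

* flag unemployment spells and number the blocks they open
gen byte is_ue = inlist(erwstat, 11, 12, 15)
bysort persnr_siab (nspell): gen long ue_block = sum(is_ue)

* spell length in days, set to 0 for the unemployment spell itself
gen long spell_days = cond(is_ue, 0, endorig - begorig + 1)

* cumulative days since the unemployment spell that opened the block
bysort persnr_siab ue_block (nspell): gen long time_check = sum(spell_days)

* the last row of a block holds the block total; store it on erwstat == 11 only
by persnr_siab ue_block: gen long dur_check = time_check[_N] if is_ue & erwstat == 11

sort persnr_siab nspell
* the rebuilt running total should agree with time_until_last_ue
assert time_check == time_until_last_ue
```

Because the running sum never decreases within a block, the value in the last row of the block is also its maximum, so no separate egen step is needed here. In this snippet dur_check is 1461 for spell nr. 6 and missing elsewhere.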
Thanks a lot!
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long(persnr_siab betnr_siab) float(nspell betnr_dau) int _seq float(episode_length_modified time_until_last_ue rank) int(erwstat begorig endorig)
2077314 1785975  1   0 1000  97   97 . 102 11591 11687
2077314 1785975  2   1 1000 366  463 . 101 11688 12053
2077314 1785975  3   1 1000 365  828 . 101 12054 12418
2077314 1785975  4   1 1000 199 1027 . 101 12419 12617
2077314 1785975  5   2 1000  10 1037 . 101 12618 12627
2077314      .n  6 999    0   0    0 1  11 12628 12783
2077314 1937532  7   0    1 365  365 . 101 12784 13148
2077314 1937532  8   1    2 366  731 . 101 13149 13514
2077314 1937532  9   1    3 365 1096 . 101 13515 13879
2077314 1937532 10   2    4 365 1461 . 101 13880 14244
2077314      .n 11 999    0   0    0 .  12 14245 14402
2077314 2176227 12   0    1 207  207 . 101 14403 14609
2077314 2176227 13   2    2 174  381 . 101 14610 14783
2077314      .n 14 999    0   0    0 2  11 14784 14883
2077314 1546868 15   0    1  92   92 . 109 14884 14975
2077314 1546868 16   1    2 365  457 . 101 14976 15340
2077314 1546868 17   1    3 365  822 . 102 15341 15705
2077314 1546868 18   1    4 365 1187 . 101 15706 16070
2077314 1546868 19   2    5 366 1553 . 101 16071 16436
end
format %tdD_m_CY begorig
format %tdD_m_CY endorig
label values betnr_siab miss_en
label def miss_en .n ".n N/A", modify
label values betnr_dau lbl_prev
label def lbl_prev 0 "First spell in the company", modify
label def lbl_prev 1 "One of middle spells in the company", modify
label def lbl_prev 2 "Last spell in the company", modify
label def lbl_prev 3 "One-off", modify // not seen in case of this individual
label def lbl_prev 999 "Unemployment Spell", modify
label values erwstat erwstat_en
label def erwstat_en 11 "11 ALG Unemployment benefits", modify
label def erwstat_en 12 "12 ALHI Unemployment assistance", modify
label def erwstat_en 101 "101 Employees liable to social security without special characteristics", modify
label def erwstat_en 102 "102 Trainees without special characteristics", modify
label def erwstat_en 109 "109 Marginal part-time workers", modify
Code:
* Exiting reemployment into another UI (time between two unemployment spells)
*-------------------------------------------------------------------------------
* label what spell is the first in a company, the subsequent, and the last in the company. Also label the unemployment spells
gsort persnr betnr nspell begorig
order persn betnr nspell
by persn: gen prev_same = 0 if betnr == betnr[_n+1] & persn != persn[_n-1]
order prev_same, after (betnr)
by persn: replace prev_same = 0 if betnr != betnr[_n-1] & prev_same == . & betnr_siab != .n
by persn: replace prev_same = 1 if betnr == betnr[_n-1] & prev_same == . & betnr_siab != .n
replace prev_same = 999 if quelle == 2 & prev_same == .
by persnr: replace prev_same = 2 if betnr == betnr[_n-1] & betnr != betnr[_n+1] & quelle != 2 //after the last line, no more missing
by persnr: replace prev_same = 3 if prev_same == 0 & prev_same[_n+1] == 0
label define lbl_prev ///
    0 "First spell in the company" ///
    1 "One of middle spells in the company" ///
    2 "Last spell in the company" ///
    3 "One-off" ///
    999 "Unemployment Spell"
label values prev_same lbl_prev
ren prev_same betnr_dau
gsort persnr nspell begorig
tab betnr_dau, m

* create a sequence that counts 1, 2, 3, and onwards, from every unemployment spell (11, 12, etc.) as 0.
* (tsspell is a user-written command from SSC)
gen betnr_date = begorig if betnr_dau == 999
order betnr_date, after (betnr_dau)
format betnr_date %tdDD_Mon_CCYY
bysort persnr_siab (begorig) : replace betnr_date = betnr_date[_n-1] if missing(betnr_date)
gen wanted = begorig - betnr_date
tsset persnr_siab nspell
tsspell, fcond(betnr_dau == 999)
order _spell _seq _end, after (betnr_date)
replace _seq = _seq - 1 if _seq
tsset, clear
by persn: replace _seq = 1000 if _seq == 0 & betnr_dau != 999
drop _spell _end betnr_date
sort persnr_siab nspell begorig

* create a variable to indicate new sequences starting with _seq == 0
gen new_sequence = (_seq == 0)
order new_sequence, after (_s)

* number the sequences within each person (running count of new_sequence)
by persnr_siab: gen sequence_id = sum(new_sequence)
order sequence_id, after (new_sequence)

* generate a modified episode_length variable where the episode length is 0 when the _seq is 0.
by persnr: gen episode_length_modified = episode_length
*order episode_length_modified episode_length, before (nspell)
replace episode_length_modified = 0 if _seq == 0

* calculate the cumulative sum of episode_lengths within each sequence
bysort persnr_siab sequence_id (nspell): gen time_until_last_ue = sum(episode_length_modified)
order betnr_dau _s episode_length_modified time_until_last_ue, after (nspell)

* clean up
drop new_sequence sequence_id
sort persn nspell begorig