  • Taking a value of the observation of one variable and placing it as a value of another observation in another variable

    This is what my spell data set looks like:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(persnr_siab betnr_siab) float(nspell betnr_dau) int _seq float(episode_length_modified time_until_last_ue rank) int(erwstat begorig endorig)
    2077314 1785975  1   0 1000  97   97 . 102 11591 11687
    2077314 1785975  2   1 1000 366  463 . 101 11688 12053
    2077314 1785975  3   1 1000 365  828 . 101 12054 12418
    2077314 1785975  4   1 1000 199 1027 . 101 12419 12617
    2077314 1785975  5   2 1000  10 1037 . 101 12618 12627
    2077314      .n  6 999    0   0    0 1  11 12628 12783
    2077314 1937532  7   0    1 365  365 . 101 12784 13148
    2077314 1937532  8   1    2 366  731 . 101 13149 13514
    2077314 1937532  9   1    3 365 1096 . 101 13515 13879
    2077314 1937532 10   2    4 365 1461 . 101 13880 14244
    2077314      .n 11 999    0   0    0 .  12 14245 14402
    2077314 2176227 12   0    1 207  207 . 101 14403 14609
    2077314 2176227 13   2    2 174  381 . 101 14610 14783
    2077314      .n 14 999    0   0    0 2  11 14784 14883
    2077314 1546868 15   0    1  92   92 . 109 14884 14975
    2077314 1546868 16   1    2 365  457 . 101 14976 15340
    2077314 1546868 17   1    3 365  822 . 102 15341 15705
    2077314 1546868 18   1    4 365 1187 . 101 15706 16070
    2077314 1546868 19   2    5 366 1553 . 101 16071 16436
    end
    format %tdD_m_CY begorig
    format %tdD_m_CY endorig
    label values betnr_siab miss_en
    label def miss_en .n ".n N/A", modify
    label values betnr_dau lbl_prev
    label def lbl_prev 0 "First spell in the company", modify
    label def lbl_prev 1 "One of middle spells in the company", modify
    label def lbl_prev 2 "Last spell in the company", modify
    label def lbl_prev 3 "One-off", modify // not seen for this individual
    label def lbl_prev 999 "Unemployment Spell", modify
    label values erwstat erwstat_en
    label def erwstat_en 11 "11 ALG Unemployment benefits", modify
    label def erwstat_en 12 "12 ALHI Unemployment assistance", modify
    label def erwstat_en 101 "101 Employees liable to social security without special characteristics", modify
    label def erwstat_en 102 "102 Trainees without special characteristics", modify
    label def erwstat_en 109 "109 Marginal part-time workers", modify

    This is for one individual (2077314). betnr_siab is the company ID; betnr_dau is the spell status within that company: the first spell in the company, the last spell, one of the middle spells, or a one-off spell. One-off spells are employment spells where the individual works for the company for only one spell (typically a year in this dataset, but it can be more or less). To create the variable "time_until_last_ue" I used the following code:

    Code:
        *  Exiting reemployment into another UI (time between two unemployment spells)
        *-------------------------------------------------------------------------------
        
            * label what spell is the first in a company, the subsequent, and the last in the company. Also label the unemployment spells
            gsort persnr betnr nspell begorig
            order persn betnr nspell
            by persn: gen prev_same = 0 if betnr == betnr[_n+1] & persn != persn[_n-1]
            order prev_same, after (betnr)
            by persn: replace prev_same = 0 if betnr != betnr[_n-1] & prev_same == . & betnr_siab != .n
            by persn: replace prev_same = 1 if betnr == betnr[_n-1] & prev_same == . & betnr_siab != .n
            replace prev_same = 999 if quelle == 2 & prev_same == .
            by persnr: replace prev_same = 2 if betnr == betnr[_n-1] & betnr != betnr[_n+1] & quelle != 2 //after the last line, no more missing
            by persnr: replace prev_same = 3 if prev_same == 0 & prev_same[_n+1] == 0
            label define lbl_prev ///
                0 "First spell in the company" ///
                1 "One of middle spells in the company" ///
                2 "Last spell in the company" ///
                3 "One-off" ///
                999 "Unemployment Spell"
            label values prev_same lbl_prev
            ren prev_same betnr_dau
            gsort persnr nspell begorig
            tab betnr_dau, m
        
            * create a sequence that counts 1, 2, 3, and onwards, from every unemployment spell (11, 12, etc.) as 0.
            gen betnr_date = begorig if betnr_dau == 999
            order betnr_date, after (betnr_dau)
            format betnr_date %tdDD_Mon_CCYY
            bysort persnr_siab (begorig) : replace betnr_date = betnr_date[_n-1] if missing(betnr_date)
            gen wanted = begorig - betnr_date
    
            tsset persnr_siab nspell  
            tsspell, fcond(betnr_dau == 999)
            order _spell _seq _end, after (betnr_date)
            replace _seq = _seq - 1 if _seq
            tsset, clear
            by persn: replace _seq = 1000 if _seq == 0 & betnr_dau != 999
            drop _spell _end betnr_date
            sort persnr_siab nspell begorig
    
            * create a variable to indicate new sequences starting with _seq == 0
            gen new_sequence = (_seq == 0)
            order new_sequence, after (_s)
            * create a running sum of episode_lengths within each sequence
            by persnr_siab: gen sequence_id = sum(new_sequence)
            order sequence_id, after (new_sequence)
            * generate a modified episode_length variable where the episode length is 0 when the _seq is 0.
            by persnr: gen episode_length_modified = episode_length
            *order episode_length_modified episode_length, before (nspell)
            replace episode_length_modified = 0 if _seq == 0
            * calculate the cumulative sum of episode_lengths within each sequence
            bysort persnr_siab sequence_id (nspell): gen time_until_last_ue = sum(episode_length_modified)
            order betnr_dau _s episode_length_modified time_until_last_ue, after (nspell)
            * clean up
            drop new_sequence sequence_id
            sort persn nspell begorig

    What I now need to do is a little complicated for me. I need to generate a variable called "duration_of_reemployment" that holds a value for each unemployment insurance spell (erwstat == 11). For each such spell, I need to take the maximum of "time_until_last_ue", i.e. the value it reaches just before it resets to zero at the next unemployment spell, and place it in "duration_of_reemployment" in the row of that erwstat == 11 spell.
    For example, in this data snippet the value of "time_until_last_ue" in spell nr. 10 (identified by nspell) is 1461. This is the largest value before the variable resets to 0. So I would like to put this value next to spell nr. 6 in the generated variable "duration_of_reemployment".

    _seq is built as follows: there are three kinds of unemployment spells (erwstat == 11, 12 and 15; quelle == 2), and every unemployment spell gets _seq = 0. The spells that follow an unemployment spell are numbered 1, 2, 3, and so on, and the counter resets to 0 at every unemployment spell. The value 1000 is assigned to spells that come before the individual's first unemployment spell (the first _seq == 0). Although _seq is 0 for every kind of unemployment spell, I only want to copy the maximum of "time_until_last_ue" into "duration_of_reemployment" for unemployment spells with erwstat == 11 (i.e. unemployment spells with benefits). The rest can stay missing; I do not need them.
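
    As a cross-check on this description, the numbering can be reproduced without -tsspell-. The following is a minimal sketch, assuming the -dataex- example above is in memory and that unemployment spells are exactly those with betnr_dau == 999; is_ue, ue_block and seq_check are hypothetical names introduced only for illustration.

    ```stata
    * Sketch only: rebuild the _seq numbering from scratch and compare it with _seq.
    sort persnr_siab nspell
    gen byte is_ue = betnr_dau == 999                 // 1 for unemployment spells
    by persnr_siab: gen ue_block = sum(is_ue)         // stays 0 until the first unemployment spell
    bysort persnr_siab ue_block (nspell): gen seq_check = _n - 1   // 0 at the unemployment spell, then 1, 2, 3, ...
    replace seq_check = 1000 if ue_block == 0         // spells before the first unemployment spell
    assert seq_check == _seq                          // checked against the example data only
    ```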

    Intuition: "time_until_last_ue" resets to 0 only at unemployment spells (not just erwstat == 11, but also 12 and 15). So the variable cumulatively counts the duration (in days) between two unemployment spells: it starts at 0 in the unemployment spell (e.g. spell nr. 6, _seq 0), adds the length of each reemployment spell that follows, and stops when the individual exits into unemployment again (spell nr. 11, _seq 0 again). If you have a simpler way of solving this problem, that would also be great.
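
    The cumulative count described here can be sketched in a few lines. This assumes the -dataex- example above is in memory, so that episode_length_modified is already 0 in unemployment spells; run_id and time_check are hypothetical names.

    ```stata
    * Sketch only: running total of spell lengths that restarts at every unemployment spell.
    sort persnr_siab nspell
    by persnr_siab: gen run_id = sum(_seq == 0)       // increases by 1 at each unemployment spell
    bysort persnr_siab run_id (nspell): gen time_check = sum(episode_length_modified)
    assert time_check == time_until_last_ue           // checked against the example data only
    ```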

    Thanks a lot!
    Last edited by Adrij Chakraborty; 01 Aug 2024, 07:47. Reason: Oversight in terms of data description

  • #2
    Would this do it?

    Code:
    sort persnr_siab nspell
    gen ue_spell = sum(_seq == 0)
    bys persnr_siab ue_spell (nspell): egen wanted = max(time_until_last_ue)
    replace wanted = . if erwstat != 11
    drop ue_spell
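
    As a quick check of this approach on the example from #1, here is a sketch that assumes the -dataex- data are in memory. ue_run and reemp_check are hypothetical names; reemp_check is used instead of wanted because the code in #1 also generates a variable called wanted, so that name may already be taken in the full data.

    ```stata
    * Sketch only: same logic as above, with the counter built within person.
    sort persnr_siab nspell
    by persnr_siab: gen ue_run = sum(_seq == 0)
    bysort persnr_siab ue_run (nspell): egen reemp_check = max(time_until_last_ue)
    replace reemp_check = . if erwstat != 11
    list nspell erwstat time_until_last_ue reemp_check if erwstat == 11, noobs
    assert reemp_check == 1461 if nspell == 6
    assert reemp_check == 1553 if nspell == 14
    drop ue_run
    ```

    In the example, the run after spell 14 is not followed by another unemployment spell, so 1553 is the time observed so far rather than a completed reemployment duration.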


    • #3
      Hi Hemanshu, I'll try your code. I have managed to do it myself in the following way (it's a little lengthy):

      Code:
              gen betnr_date = begorig if betnr_dau == 999 
              order betnr_date, after (betnr_dau)
              format betnr_date %tdDD_Mon_CCYY
              bysort persnr_siab (begorig) : replace betnr_date = betnr_date[_n-1] if missing(betnr_date)
              gen wanted = begorig - betnr_date
      
              tsset persnr_siab nspell  
              tsspell, fcond(betnr_dau == 999) 
              order _spell _seq _end, after (betnr_date)
              replace _seq = _seq - 1 if _seq 
              tsset, clear
              by persn: replace _seq = 1000 if _seq == 0 & betnr_dau != 999
              drop _spell _end betnr_date
              sort persnr_siab nspell begorig 
      
              gen new_sequence = (_seq == 0)
              order new_sequence, after (_s)
              by persnr_siab: gen sequence_id = sum(new_sequence)
              order sequence_id, after (new_sequence)
              by persnr: gen episode_length_modified = episode_length 
              replace episode_length_modified = 0 if _seq == 0
              bysort persnr_siab sequence_id (nspell): gen time_until_last_ue = sum(episode_length_modified)
              order betnr_dau _s episode_length_modified time_until_last_ue, after (nspell)
      
              drop new_sequence sequence_id
              sort persn nspell begorig
        
              sort persnr_siab nspell begorig
              gen reempd = .
              order reempd, after (time_until)
              gen seq_group = .
              order seq_group, after (_s)
              bysort persnr_siab (nspell): gen spell_start = _s == 0
              order spell_start, after (seq_group)
              bysort persnr_siab (nspell): replace seq_group = sum(spell_start)
              bysort persnr_siab seq_group: egen max_time_until = max(time_until)
              order max_time, last
              bysort persnr_siab seq_group: replace reempd = max_time_until if erw==11

              drop max_time_until spell_start seq_group
