Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Spells in Groups in a Panel Context

    I've looked high & low for a solution to this, so my apologies if I'm repeating an issue from a different post/FAQ/Stata Journal. This journal article: http://www.stata-journal.com/sjpdf.h...iclenum=dm0029, among many other sources, has gotten me pretty far already, but I can't seem to fix one last bug/issue dealing with sequencing spells.

    I'm using Stata 13 SE and have a dataset that looks like:
    id collegeid Enroll Date End Date Grad Date EnrGrad_Dates
    1 2341 13387 13507 13387
    1 8789 13390 13510 13390
    1 2341 13537 13657 13537
    1 2341 13743 13863 13743
    1 2341 13942 14062 13942
    1 2341 14232 14352 14232
    1 2341 . . 14578 14578
    2 4563 17891 18011 17891
    2 4563 18002 18122 18002
    2 4563 18250 18370 18250
    2 4563 18353 18473 18353
    2 4563 19567 19624 19567
    3 . . . . .
    4 1289 18763 18883 18763
    4 1289 18987 19107 18987
    4 9089 20032 20152 20032
    4 9089 20198 20318 20198
    4 1289 20346 20466 20346
    My goal is to sequentially identify the enrollment/graduation patterns for each 'id'. As you can see, I've already sorted by id & EnrGrad_Dates and my date variables are in Stata format (e.g., days since 1/1/1960). I'm able to get most of the way there with:

    egen college_tag=group(id collegeid) if collegeid!=.
    by id: gen college_tag2=(college_tag!=[_n-1]) if collegeid!=.
    by id: gen college_seq=sum(college_tag2) if collegeid!=.
    sort id EnrGrad_Dates

    This gets me:
    id college id Enroll Date End Date Grad Date EnrGrad_Dates college_tag college_tag2 college_seq
    1 8789 13387 13507 13387 1 1 3
    1 1245 13390 13510 13390 2 1 1
    1 8789 13537 13657 13537 1 0 3
    1 2341 13743 13863 13743 1 0 2
    1 2341 13942 14062 13942 1 0 2
    1 2341 14232 14352 14232 1 0 2
    1 2341 . . 14578 14578 1 0 2
    2 4563 17891 18011 17891 3 1 1
    2 4563 18002 18122 18002 3 0 1
    2 4563 18250 18370 18250 3 0 1
    2 4563 18353 18473 18353 3 0 1
    2 4563 19567 19624 19567 3 0 1
    3 . . . . . . . .
    4 1289 18763 18883 18883 4 1 1
    4 1289 18987 19107 19107 4 0 1
    4 9089 20032 20152 20152 5 1 2
    4 9089 20198 20318 20318 5 0 2
    4 1289 20346 20466 20466 4 0 1

    Ideally, I'd like to have the college_seq variable arranged numerically by collegeid & EnrGrad_Dates, but if I try to bysort on collegeid & EnrGrad_Dates, it treats each date like it's own entity, which makes sense. If it wasn't the case that ids concurrently enroll in a college (two colleges--one term) or that they go back to previously enrolled in colleges, the normal approach to spells would work out great, but because I don't want an already attended (preceding row within an id) college to start a new spell, I struggle with the spell approach. Is there a way to use a spell approach that takes the repeated collegeid variable into account, or is there a way to manipulate my college_seq variable to get to the same place I'd like to. Here's what I'd want to have:



    id college id Enroll Date End Date Grad Date EnrGrad_Dates college_tag college_tag2 college_seq desired_seq
    1 8789 13387 13507 13387 1 1 3 1
    1 1245 13390 13510 13390 2 1 1 2
    1 8789 13537 13657 13537 1 0 3 1
    1 2341 13743 13863 13743 1 0 2 3
    1 2341 13942 14062 13942 1 0 2 3
    1 2341 14232 14352 14232 1 0 2 3
    1 2341 . . 14578 14578 1 0 2 3
    2 4563 17891 18011 17891 3 1 1 1
    2 4563 18002 18122 18002 3 0 1 1
    2 4563 18250 18370 18250 3 0 1 1
    2 4563 18353 18473 18353 3 0 1 1
    2 4563 19567 19624 19567 3 0 1 1
    3 . . . . . . . .
    4 1289 18763 18883 18883 4 1 1 1
    4 1289 18987 19107 19107 4 0 1 1
    4 9089 20032 20152 20152 5 1 2 2
    4 9089 20198 20318 20318 5 0 2 2
    4 1289 20346 20466 20466 4 0 1 1

    Thanks!








    Last edited by William Webster; 12 May 2016, 14:01.
Working...
X