  • Calculating spell length

    I am using Stata 14.

    My current panel dataset is in person-year format (long). I only look at respondents who have changed from employment status A to B, and I would like to know how long they had been in status A. For that I have monthly spell data in the form of 12 variables for every year, where 1 indicates being in status A and -2 indicates not being in status A.

    I am not sure how to calculate the exact spell length. My first thought was to reshape the dataset into person-month format and then simply count consecutive months in status A.

    So, like this:
    Code:
    reshape long d0, i(pid syear) j(month)

    However, the month of the interview differs between respondents, so maybe I should take that into account. I am unsure whether this is the right approach and, in particular, how to align the interview with the right month.


    This is my monthly spell data:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long pid int syear byte(d001 d002 d003 d004 d005 d006 d007 d008 d009 d010 d011 d012)
    602 2000 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2
    602 2001 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2
    602 2002 -2 -2  1  1  1  1  1  1  -2  1  1  1
    602 2003  1  1  1  1  1  -2  -2  1  1  1  -2  1
    602 2004 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2 -2
    end
    The data are retrospective, so the values recorded in syear 2002 refer to January to December 2001.
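
    Given that, one possible way to count consecutive months is sketched below. It is only a sketch under assumptions: inA, mdate, and run are made-up names, and the one-year shift relies on the statement that each syear describes the previous calendar year.

    Code:
    * sketch only; inA, mdate, and run are hypothetical names
    reshape long d0, i(pid syear) j(month)
    gen byte inA = d0 == 1
    * retrospective data: syear 2002 is assumed to describe calendar year 2001
    gen mdate = ym(syear - 1, month)
    format %tm mdate
    * running count of consecutive months in A; restarts after a month outside A or a gap
    bysort pid (mdate): gen run = inA
    by pid: replace run = run[_n-1] + 1 if _n > 1 & inA & mdate == mdate[_n-1] + 1

    The value of run in the final month of a spell would then be the length of that spell.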


    My current dataset looks like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long pid int syear byte pmonin float age byte sex
    8605 2003 2 24 1
    8605 2005 2 26 1
    8605 2007 2 28 1
    8605 2008 4 29 1
    9002 1986 3 20 1
    9002 1989 2 23 1
    9201 1990 3 35 0
    9201 2000 2 45 0
    9203 2001 2 21 0
    9203 2007 3 26 0
    9204 2006 3 23 1
    9205 1993 2 33 1
    9302 1987 3 23 0
    9302 1989 3 25 0
    9401 1993 4 32 0
    9401 1998 2 37 0
    9801 1993 4 35 0
    9801 2003 5 45 0
    9801 2006 6 48 0
    9802 1985 3 27 1
    9803 1989 2 29 1
    end
    label values pmonin pmonin
    label def pmonin 2 "[2] February", modify
    label def pmonin 3 "[3] March", modify
    label def pmonin 4 "[4] April", modify
    label def pmonin 5 "[5] May", modify
    label def pmonin 6 "[6] June", modify
    label values sex sex
    label def sex 0 "fem", modify
    label def sex 1 "male", modify


    In the end, I would like to end up with a dataset like this, where the last column is the one I need:

    pid   syear   transitioned from A to B   covariates   length of A spell
    602   2002    1                          x            6
    602   2003    1                          x            8
    602   2003    1                          x            3
    602   2004    1                          x            1
    603   ....
    Last edited by sladmin; 06 Feb 2018, 09:39. Reason: anonymize user

  • #2
    Googling "spell length in Stata" turns up some good options to look at. Personally, I would just reshape to wide, concatenate the monthly values into a single string variable, and operate on that with string or regular-expression functions.

    http://www.stata-journal.com/sjpdf.h...iclenum=dm0029
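
    A minimal sketch of the string idea, assuming the d001-d012 layout from #1. The names s and longest are hypothetical, and the string is built per person-year here, so spells that cross a year boundary would first need the yearly strings joined within pid.

    Code:
    * sketch only; s and longest are hypothetical names
    gen s = ""
    foreach v of varlist d001-d012 {
        replace s = s + cond(`v' == 1, "A", "-")
    }
    * longest run of A within the year: try runs of length 1 to 12
    gen byte longest = 0
    local pattern ""
    forvalues k = 1/12 {
        local pattern "`pattern'A"
        replace longest = `k' if strpos(s, "`pattern'") > 0
    }

    For the row 602 2002 in #1 the string would be "--AAAAAA-AAA", so longest would be 6.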



    • #3
      Interesting! I experimented with both string functions and loops and finally got the spell length using old -tsspell-. However, I still want to make sure I grab the last spell before the interview, and this is still puzzling me. Reshaped and with -tsspell- applied, my dataset now looks like this:

      pmonin: month of interview
      event: transitioned in this month
      maxseq: length of spell in state A

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long pid int syear byte(month pmonin event _spell) int _seq byte _end float maxseq
      7402 1992 12 5 0 3 40 0  .
      7402 1993  1 5 0 3 41 0  .
      7402 1993  2 5 0 3 42 0  .
      7402 1993  3 5 0 3 43 0  .
      7402 1993  4 5 0 3 44 1  .
      7402 1993  5 5 1 4  1 0  .
      7402 1993  6 5 1 4  2 0  .
      7402 1993  7 5 1 4  3 0  .
      7402 1993  8 5 1 4  4 0  .
      7402 1993  9 5 1 4  5 0  .
      7402 1993 10 5 1 4  6 0  .
      7402 1993 11 5 1 4  7 0  .
      7402 1993 12 5 1 4  8 0  .
      7402 1994  1 5 1 4  9 0  .
      7402 1994  2 5 1 4 10 0  .
      7402 1994  3 5 1 4 11 0  .
      7402 1994  4 5 1 4 12 0  .
      7402 1994  5 5 1 4 13 0  .
      7402 1994  6 5 1 4 14 0  .
      7402 1994  7 5 1 4 15 0  .
      7402 1994  8 5 1 4 16 0  .
      7402 1994  9 5 1 4 17 0  .
      7402 1994 10 5 1 4 18 0  .
      7402 1994 11 5 1 4 19 1 19
      7402 1994 12 5 0 5  1 0  .
      7402 1995  1 5 0 5  2 0  .
      7402 1995  2 5 0 5  3 0  .
      7402 1995  3 5 0 5  4 0  .
      7402 1995  4 5 0 5  5 0  .
      7402 1995  5 5 0 5  6 0  .
      7402 1995  6 5 0 5  7 0  .
      7402 1995  7 5 0 5  8 0  .
      7402 1995  8 5 0 5  9 0  .
      7402 1995  9 5 0 5 10 0  .
      7402 1995 10 5 0 5 11 0  .
      7402 1995 11 5 0 5 12 0  .
      7402 1995 12 5 0 5 13 0  .
      7402 1996  1 8 0 5 14 0  .
      7402 1996  2 8 0 5 15 0  .
      7402 1996  3 8 0 5 16 0  .
      7402 1996  4 8 0 5 17 0  .
      7402 1996  5 8 0 5 18 0  .
      7402 1996  6 8 0 5 19 0  .
      7402 1996  7 8 0 5 20 0  .
      7402 1996  8 8 0 5 21 0  .
      7402 1996  9 8 0 5 22 0  .
      7402 1996 10 8 0 5 23 0  .
      7402 1996 11 8 0 5 24 0  .
      7402 1996 12 8 0 5 25 0  .
      7402 1997  1 3 0 5 26 0  .
      7402 1997  2 3 0 5 27 0  .
      7402 1997  3 3 0 5 28 0  .
      7402 1997  4 3 0 5 29 0  .
      7402 1997  5 3 0 5 30 0  .
      7402 1997  6 3 0 5 31 0  .
      7402 1997  7 3 0 5 32 0  .
      7402 1997  8 3 0 5 33 0  .
      7402 1997  9 3 0 5 34 0  .
      7402 1997 10 3 0 5 35 0  .
      7402 1997 11 3 0 5 36 0  .
      7402 1997 12 3 0 5 37 0  .
      7402 1998  1 3 0 5 38 0  .
      7402 1998  2 3 0 5 39 0  .
      7402 1998  3 3 0 5 40 0  .
      7402 1998  4 3 0 5 41 0  .
      7402 1998  5 3 0 5 42 0  .
      7402 1998  6 3 0 5 43 0  .
      7402 1998  7 3 0 5 44 0  .
      7402 1998  8 3 0 5 45 0  .
      7402 1998  9 3 0 5 46 0  .
      7402 1998 10 3 0 5 47 0  .
      7402 1998 11 3 0 5 48 0  .
      7402 1998 12 3 0 5 49 0  .
      7402 1999  1 5 0 5 50 0  .
      7402 1999  2 5 0 5 51 0  .
      7402 1999  3 5 0 5 52 0  .
      7402 1999  4 5 0 5 53 0  .
      7402 1999  5 5 0 5 54 0  .
      7402 1999  6 5 0 5 55 0  .
      7402 1999  7 5 0 5 56 0  .
      7402 1999  8 5 0 5 57 1  .
      7402 1999  9 5 1 6  1 0  .
      7402 1999 10 5 1 6  2 0  .
      7402 1999 11 5 1 6  3 0  .
      7402 1999 12 5 1 6  4 1  4
      end
      label values pmonin pmonin
      label def pmonin 3 "[3] March", modify
      label def pmonin 4 "[4] April", modify
      label def pmonin 5 "[5] May", modify
      label def pmonin 8 "[8] August", modify

      So I now want to make sure I take "maxseq" of the last spell in state A and then, I think, I should -collapse- the dataset back into person-year format, because (I forgot to mention this) my dependent variable is only assessed once a year, at the time of the interview. For example, above you can see that in 1994 the person reported 19 months in state A until 1993 and was interviewed in May 1994. Whether there was more time in state A I can of course only see in the follow-up data for 1995.

      I am stuck at how to grab the last "maxseq" before the interview month and really confused on how to handle the data.

      I think I need a variation of the code below. As written, it only grabs the immediately preceding (_n-1) value, not the last nonmissing value. The idea is to take the last value, attach it to the month in which the interview happens, add all my other covariates to that month, and keep only that month. In theory this sounds plausible; in practice I am still lost.

      Code:
      * takes only the immediately preceding value of maxseq
      sort pid syear month
      by pid: gen prev = maxseq[_n-1] if month == pmonin
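
      One possible variation, offered only as a sketch: carry the last nonmissing maxseq forward within each person and then read it off just before the interview month. The names mdate, lastA, and prevA are hypothetical, and the sketch assumes that syear and month together identify a calendar month.

      Code:
      * sketch only; mdate, lastA, and prevA are hypothetical names
      gen mdate = ym(syear, month)
      format %tm mdate
      * carry the most recent nonmissing maxseq forward within person
      bysort pid (mdate): gen lastA = maxseq
      by pid: replace lastA = lastA[_n-1] if missing(lastA) & _n > 1
      * length of the last completed A spell, placed on the interview month
      by pid: gen prevA = lastA[_n-1] if month == pmonin

      Keeping only the rows with month == pmonin would then leave one row per person and interview year.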
      Last edited by sladmin; 06 Feb 2018, 09:40. Reason: anonymize user



      • #4
        "Old" tsspell is a sprightly 14-year-old, thank you. As in http://quoteinvestigator.com/2010/10...ld-cary-grant/

        How old Cary Grant?
        Old Cary Grant fine. How you?
        More seriously, I don't see how you applied tsspell (SSC). It requires a prior tsset and since you have monthly data, the most natural candidate is a monthly date variable, which I can't see. I don't understand what you're asking, but I can suggest that the last spell ends at a monthly date given by


        Code:
        gen mdate = ym(syear, month)
        egen  last_spell_end = max(mdate / _end), by(pid)
        format %tm mdate last_spell_end
        That may help.
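
        A sketch of how that idiom might be narrowed to spells in state A, continuing from the code above. The division does the selection: mdate / _end is missing whenever _end is 0, and max() ignores missing values. The sketch assumes that event == 1 marks months in state A, which the listing in #3 suggests but does not state; lastA_end and lastA_len are hypothetical names.

        Code:
        * sketch only; lastA_end and lastA_len are hypothetical names
        * the divisor is 1 only in the final month of a spell with event == 1
        egen lastA_end = max(mdate / (_end & event == 1)), by(pid)
        format %tm lastA_end
        * _seq in the final month of a spell equals the spell's length
        gen lastA_len = _seq if mdate == lastA_end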
