
  • Converting spell/survival data into panel data

    Dear Stata users,

    I am currently working with panel and spell data. I have found a way to convert spell data into panel data format (long). The spell data distinguish between calendar data and yearly data. I can convert the yearly data with the following commands:

    gen duration = endy - beginy + 1
    expand duration
    by persnr, sort: gen year = beginy +_n -1
    bysort persnr (begin) : gen period = _n

    To be as precise as possible, however, I want to use the calendar data. See the following data example:

    id | spellnr | begin         | end           | spelltype
    1  | 1       | [12] 1992 Mar | [14] 1992 Apr | [1] working full time
    2  | 6       | [13] 1998 Apr | [16] 1998 Sep | [2] time out
    2  | 7       | [17] 2002 Nov | [19] 2004 Jan | [1] working full time
    2  | 8       | [21] 2004 Feb | [23] 2007 Jan | [2] time out

    The main issue is that the variables "begin" and "end" appear in blue. Currently, I cannot work with the dates directly: if I try to calculate anything with them, the number in brackets is used.
    Furthermore, I would like to have the data in panel data format (long), with a variable indicating the total time of "time out" up to each year for each individual.

    I am using Stata 11.

    Thanks to all for the help.

  • #2
    Hello "DanielIG" (please re-register with name and family name, after clicking on the "contact us" sign, below to the right).

    First, as you know, in Stata's Data Editor red marks string variables and blue marks numeric variables with value labels.

    Now, going directly to one of your problems:

    The main issue is that the variables "begin" and "end" appear in blue. Currently, I cannot work with the dates directly.
    Here is what I have been doing in such cases. Should someone have a better solution, I'd be glad to learn it! Please share it with us.

    First, create a new variable ("newvar") as a copy of the old variable ("oldvar"): . clonevar newvar = oldvar

    Second, decode it into a string variable named "datevar": . decode newvar, gen(datevar)

    Third, now that you have a string variable, use the date() function to turn it into a date variable named "bdate": . gen bdate = date(datevar, "DMY")

    Fourth, format the variable so that it is displayed like a normal date, say 11jan2012: . format bdate %td

    Not to forget: the command "format" only changes how dates are displayed, so that they look like a "normal" date. In short, dates will then appear in black in your dataset, like all other variables (with the aforementioned exceptions).

    Hopefully that might be of some help.
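
As an illustration of the four steps above, here is a minimal self-contained sketch. The variable oldvar, the label name datelbl and the label texts are made up for the example. The "DMY" mask only fits label texts written in day-month-year order, so labels in another layout would need a different mask or a different approach.

```stata
* toy data: a numeric variable whose value labels hold the date text
clear
input byte oldvar
1
2
end
label define datelbl 1 "11jan2012" 2 "05mar2013"
label values oldvar datelbl

* step 1: work on a copy of the labelled variable
clonevar newvar = oldvar

* step 2: turn the value labels into a string variable
decode newvar, gen(datevar)

* step 3: convert the string into a Stata daily date
gen bdate = date(datevar, "DMY")

* step 4: display the numeric date in a readable form
format bdate %td
list
```

If step 3 returns missing values, the label text probably does not match the mask, which is worth checking with list datevar before going further.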

    Best,

    Marcos



    • #3
      Hi Marcos,

      renaming is in process (at least, I sent a mail to address the issue).

      Thank you for your help. I tried the above example, but it did not work: the last step produces only missing values, although everything up to that line works fine. I think the date conversion cannot interpret the number in brackets ([14], for instance). Somehow, I have to extract the year and the month.

      Maybe http://www.ats.ucla.edu/stat/stata/faq/regex.htm can help. I experimented a little with the syntax but couldn't work it out. Maybe someone more experienced has an idea?

      Best,

      Daniel



      • #4
        DanielG,
        as an aside to Marcos' useful insights, I would suggest that you (as per the FAQ) post what you typed and what Stata gave you back, so that others on the list can see your problem. As the FAQ also notes, statements like "doesn't work" convey the poster's disappointment but are of little help in diagnosing what went wrong along the way and advising accordingly.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Hi to all,

          for the sake of completeness: I figured out how to handle the issue. Please find the code below. While it is not the most parsimonious solution, it might help others. I generated a variable day that contains only ones so that I can use the mdy() function. If you use this routine, keep in mind that you would normally also need the actual day of the month. Since my data do not include this information, I use the first day of each month.

          The next point is as follows: I want to use the data to generate a variable that contains the total time spent in all spells of the same type up to the respective year. I need this because my panel is in long format, and I want to merge this variable with my panel data.

          Does anyone have a solution in mind? This would be nice!
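
One possible way to build such a running total, offered only as a sketch: expand each spell to one row per month, count the months of a given spell type per person and calendar year, and then take a cumulative sum. The numeric columns byear, bmonth, eyear and emonth are assumptions standing in for the year and month extracted from "begin" and "end"; the rows mirror the data example in the first post, with spelltype 2 meaning "time out".

```stata
* toy spell data; column names are hypothetical
clear
input id spellnr byear bmonth eyear emonth spelltype
1 1 1992 3 1992 4 1
2 6 1998 4 1998 9 2
2 7 2002 11 2004 1 1
2 8 2004 2 2007 1 2
end

* monthly dates for spell begin and end
gen begin_m = ym(byear, bmonth)
gen end_m   = ym(eyear, emonth)

* one row per month of each spell
gen nmonths = end_m - begin_m + 1
expand nmonths
bysort id spellnr: gen month = begin_m + _n - 1
format month %tm

* calendar year that each month falls in
gen year = yofd(dofm(month))

* months of "time out" per person and year
gen byte timeout = spelltype == 2
collapse (sum) timeout_months = timeout, by(id year)

* running total up to and including each year
bysort id (year): gen timeout_total = sum(timeout_months)
list, sepby(id)
```

The result has one row per id and year and could be merged into a long panel on those two keys. Years in which a person has no spell at all do not appear in the result, so the total would have to be carried forward after merging.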

          Best

          Daniel



          ********************************************************************************

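          * copy the labelled variable and decode its value labels into a string
          clonevar newbegin = begin
          decode newbegin, gen(datebegin)

          * extract the four-digit year and the three-letter month abbreviation
          cap drop byear bmonth
          gen byear = regexs(0) if(regexm(datebegin, "[0-9][0-9][0-9][0-9]"))
          gen bmonth = regexs(0) if(regexm(datebegin, "[A-Z][a-z][a-z]"))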

          cap drop bmonth2
          gen bmonth2 = .

          replace bmonth2=1 if bmonth == "Jan"
          replace bmonth2=2 if bmonth == "Feb"
          replace bmonth2=3 if bmonth == "Mar"
          replace bmonth2=4 if bmonth == "Apr"
          replace bmonth2=5 if bmonth == "Mai"
          replace bmonth2=6 if bmonth == "Jun"
          replace bmonth2=7 if bmonth == "Jul"
          replace bmonth2=8 if bmonth == "Aug"
          replace bmonth2=9 if bmonth == "Sep"
          replace bmonth2=10 if bmonth == "Okt"
          replace bmonth2=11 if bmonth == "Nov"
          replace bmonth2 = 12 if bmonth == "Dez"

          gen day = 1

          destring byear, replace

          cap drop date110
          gen date110 = mdy(bmonth2, day, byear)

          format date110 %td


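          * same procedure for the end of the spell
          clonevar newend = end
          decode newend, gen(dateend)

          * extract the four-digit year and the three-letter month abbreviation
          cap drop eyear emonth
          gen eyear = regexs(0) if(regexm(dateend, "[0-9][0-9][0-9][0-9]"))
          gen emonth = regexs(0) if(regexm(dateend, "[A-Z][a-z][a-z]"))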

          cap drop emonth2
          gen emonth2 = .

          replace emonth2=1 if emonth == "Jan"
          replace emonth2=2 if emonth == "Feb"
          replace emonth2=3 if emonth == "Mar"
          replace emonth2=4 if emonth == "Apr"
          replace emonth2=5 if emonth == "Mai"
          replace emonth2=6 if emonth == "Jun"
          replace emonth2=7 if emonth == "Jul"
          replace emonth2=8 if emonth == "Aug"
          replace emonth2=9 if emonth == "Sep"
          replace emonth2=10 if emonth == "Okt"
          replace emonth2=11 if emonth == "Nov"
          replace emonth2 = 12 if emonth == "Dez"


          destring eyear, replace

          cap drop date220
          gen date220 = mdy(emonth2, day, eyear)

          format date220 %td

          ********************************************************************************
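
The month lookup in the code above could be shortened with a loop, and a monthly date built with ym() avoids the need for an artificial day variable. This is only a sketch of that variation, not the poster's routine: the variable begin_m and the toy strings are assumptions, while the month abbreviations are taken from the code above.

```stata
* toy strings in the same layout as the decoded labels
clear
input str20 datebegin
"[12] 1992 Mar"
"[13] 1998 Apr"
"[17] 2002 Nov"
end

* extract year (as a number) and month abbreviation
gen byear  = real(regexs(0)) if regexm(datebegin, "[0-9][0-9][0-9][0-9]")
gen bmonth = regexs(0) if regexm(datebegin, "[A-Z][a-z][a-z]")

* map the abbreviation to its month number in a loop
local months Jan Feb Mar Apr Mai Jun Jul Aug Sep Okt Nov Dez
gen bmonth2 = .
forvalues m = 1/12 {
    local name : word `m' of `months'
    replace bmonth2 = `m' if bmonth == "`name'"
}

* monthly date, no day of the month required
gen begin_m = ym(byear, bmonth2)
format begin_m %tm
list
```

The same steps would apply to the end variable. Differences between two %tm dates are in months, which may be the more natural unit for spell durations here.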
