  • Maye Ehab
    started a topic Transitions to non-employment

    Transitions to non-employment

    Dear Statalisters,

    I am trying to set up the data in order to draw the Kaplan-Meier curve.

    I have retrospective data for the job history - particularly employment status (12 categories) and the start year - for the first job, second, third, fourth and current job.
    From this data I need to calculate the time to non-employment and a failure variable coded 1 when the person becomes non-employed.

    Could you please guide me through how to do this?

    Below is an example from my dataset for your reference.
    q6101y represents the start year of the 1st job after leaving education and q6101_01 represents the employment status of this job.
    q6102y represents the start year of the 2nd job and q6102_01 represents the employment status of this job, and so on until the current job in q6106.

    These questions are only answered by the individuals who ever worked, i.e. answering the variable evrwrk=1.

    Please let me know if you need further elaboration on the data.

    Many thanks,
    Maye

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str11 indid byte evrwrk int q6101y byte q6101_01 int q6102y byte q6102_01 int q6103y byte q6103_01 int q6104y byte(q6104_01 q6105 q6106_01)
    "12010001101" 1 1982  1 1991  8    . .    . . 2 .
    "12010001102" 0    .  .    .  .    . .    . . . .
    "12010002102" 1 2004  1    .  .    . .    . . 2 .
    "12010002103" 1 2002  1    .  .    . .    . . 2 .
    "12010002104" 0    .  .    .  .    . .    . . . .
    "12010002101" 0    .  .    .  .    . .    . . . .
    "12010003103" 0    .  .    .  .    . .    . . . .
    "12010003101" 1 1980  1 1990 16 1993 6 1995 1 2 .
    "12010003102" 0    .  .    .  .    . .    . . . .
    "12010004103" 0    .  .    .  .    . .    . . . .
    "12010004104" 0    .  .    .  .    . .    . . . .
    "12010004101" 1 1986 16 1989  1    . .    . . 2 .
    "12010004102" 0    .  .    .  .    . .    . . . .
    "12010005102" 0    .  .    .  .    . .    . . . .
    "12010005101" 1 1997  1 1998  6 2000 1 2010 1 2 .
    "12010006101" 1 1985  1 1995  1    . .    . . 2 .
    "12010006102" 0    .  .    .  .    . .    . . . .
    "12010007103" 0    .  .    .  .    . .    . . . .
    "12010007104" 0    .  .    .  .    . .    . . . .
    "12010007105" 0    .  .    .  .    . .    . . . .
    "12010007101" 1 1982  1    .  .    . .    . . 2 .
    "12010007102" 0    .  .    .  .    . .    . . . .
    "12010008101" 1 1964  1 1975  1 1983 1 1984 1 2 .
    "12010008102" 0    .  .    .  .    . .    . . . .
    "12010009103" 0    .  .    .  .    . .    . . . .
    "12010009102" 0    .  .    .  .    . .    . . . .
    "12010009101" 1 1962  1 1988  1 1989 1 1990 1 2 .
    "12010010102" 0    .  .    .  .    . .    . . . .
    "12010010101" 1 2000 16 2003  7 2004 1 2006 1 2 .
    "12010011101" 1 1991  3 1995  1 1997 1    . . 2 .
    end
    label values q6101_01 Lmempst
    label values q6102_01 Lmempst
    label values q6103_01 Lmempst
    label values q6104_01 Lmempst
    label values q6106_01 Lmempst
    label def Lmempst 1 "Waged Employee", modify
    label def Lmempst 3 "Self-Employed", modify
    label def Lmempst 16 "Other", modify
    label def Lmempst 6 "Unemployed worked before", modify
    label def Lmempst 7 "New Unemployed", modify
    label def Lmempst 8 "Housewife", modify
    label values q6105 q605
    label def q605 2 "no", modify



  • Clyde Schechter
    replied
    I'm not sure I understand the circumstances under which you need to do this carry forward.

    Here's how you would simply forward fill all missing values from the last non-missing value:

    Code:
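    * Fill only rows where employment_type is missing; filled values cascade down
    by indid (year), sort: replace employment_type = employment_type[_n-1] ///
        if _n > 1 & missing(employment_type) & !missing(year)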
    It appears, however, that you want to do this only under certain circumstances that I do not understand. Perhaps you can use what I show here as a starting point and modify it to reflect your additional conditions. If not, please post back with a clearer explanation and a new data example that includes some where you want to forward fill and some where you don't and explaining which is which, and why.
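
    To make the carry-forward idea concrete, here is a minimal sketch on toy data. The ids, years and values are made up for illustration and are not from the survey. It relies on the fact that -replace- works down the data in order, so a value filled into one row is already there when the next row is processed, and a run of consecutive missings gets filled from the last non-missing value.

    ```stata
    * Toy data: hypothetical values, not taken from the survey
    clear
    input str3 indid int year byte employment_type
    "A" 2000 1
    "A" 2004 .
    "A" 2010 .
    "B" 1998 6
    "B" 2002 1
    "B" 2006 .
    end

    * Within person, in year order, copy the previous value into each gap.
    * Rows with a missing year are left alone.
    by indid (year), sort: replace employment_type = employment_type[_n-1] ///
        if _n > 1 & missing(employment_type) & !missing(year)

    list, sepby(indid)
    ```

    Any extra condition on when to fill (for example, only the last recorded job) would go into the -if- qualifier.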



  • Maye Ehab
    replied
    Thanks for your reply.

    I did the reshape and created the origin variable.

    I have a follow-up question.
    I need to fill forward the employment type in case it was the second job and did not change throughout the survey time.
    i.e. the person was asked in 2004 about his/her employment status which was recorded for the second job; then he/she did not change his job. This shows in the data as a missing value in the year 2010, which is not true. Hence, it does not reflect correctly in the failure event and time to failure.

    Could you advise me on how to fill it forward?

    Many thanks,
    Maye



  • Clyde Schechter
    replied
    So the first thing to do is to convert this to long layout. From there it is a simple matter of -stset-ing it as multiple observations per person. I think things would also be clearer if the variables were renamed in ways that have mnemonic/semantic value instead of by question numbers from a survey. I also assume that you are interested not in the calendar year in which unemployment occurs but rather in the elapsed time from beginning of observation to unemployment. I notice that your employment category variable has two different kinds of unemployment: in the code below, I consider both of them to represent an unemployment event.

    Code:
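    * Wide to long: one row per person per job slot (jobnum = 1, 2, 3, 4, 6)
    reshape long q610@_01 q610@y, i(indid) j(jobnum)
    rename q610y year
    rename q610_01 employment_type

    * Time zero for each person is the start year of the first recorded job
    by indid (year), sort: gen origin = year[1]

    * Failure = either kind of unemployment (6 or 7); time is years since origin
    stset year, failure(employment_type == 6 7) id(indid) origin(time origin)

    sts graph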
    Note: There is a variable q6105 in your data example that I do not understand. Its name doesn't pattern with the others, and it apparently is a constant in the data. Assuming that it is irrelevant to the question at hand, this doesn't matter. Just calling it to your attention in case it is an error of some kind.
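
    Before trusting the graph, it may help to look at the spells that -stset- actually builds. The sketch below runs the same steps on four people copied from the data example, keeping only the job-history variables for jobs 1 to 4. Two things in it are assumptions rather than part of the approach above: unfilled job slots (missing year) are dropped, and category 8 (Housewife) is counted as non-employment alongside 6 and 7. Adjust the numlist to whatever definition of non-employment the study uses.

    ```stata
    * Four people from the posted example, job-history variables only
    clear
    input str11 indid int q6101y byte q6101_01 int q6102y byte q6102_01 int q6103y byte q6103_01 int q6104y byte q6104_01
    "12010001101" 1982  1 1991  8    . .    . .
    "12010003101" 1980  1 1990 16 1993 6 1995 1
    "12010005101" 1997  1 1998  6 2000 1 2010 1
    "12010006101" 1985  1 1995  1    . .    . .
    end

    reshape long q610@_01 q610@y, i(indid) j(jobnum)
    rename q610y year
    rename q610_01 employment_type
    drop if missing(year)          // assumption: unfilled job slots carry no information

    by indid (year), sort: gen origin = year[1]

    * Assumption: non-employment = 6, 7 or 8
    stset year, failure(employment_type == 6 7 8) id(indid) origin(time origin)

    * _t0 and _t are the start and end of each spell in years since origin,
    * _d flags the failure, _st marks the records used in the analysis
    list indid year employment_type _t0 _t _d _st, sepby(indid)
    stdescribe
    ```

    Records after a person's first failure are not used, since this is set up as single-failure data.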

