  • Creating a time varying variable

    Dear Statalist,

    I want to generate a time varying variable called "employment status" to find out father's employment status at the time of first birth.

    for example:
    person 1
    during the time he/she is working, status = employed
    during the time he/she is not working (between 2003 Jan and 2004 October), status = unemployed

    The variables that I have are:

    person
    job start year
    job start month
    job end year
    job end month
    job search period
    job search start year
    job search start month
    job search end year
    job search end month
    job search duration
    job search route

    How can I approach this?

    I have attached a sample of my data set with this post.

    Thank you.

    Best,
    Bibek

  • #2
    My code:

    Code:
    gen empstatus=.
    replace empstatus=1 if jobsearchp==1 // "No" job search period
    replace empstatus=0 if jobsearchp==2 // "Yes" job search period
    label define empstatus 1"employed" 0"unemployed"
    label values empstatus empstatus
    tab empstatus
    
    gen duration=.
    replace duration=jobstarty-jobendm & jobstartm-jobendm if empstatus==1



    • #3
      Can you explain your data better? First, as far as I can tell the jobsearch* variables are irrelevant to the problem at hand. Is that correct?

      Second, you have a large number of duplicate observations. Why?

      Third, what does it mean when an observation shows a jobstart date but no job end date: how long did that job last? Or was there really no job then?

      Fourth, how do we show in the resulting data set that somebody was unemployed "between 2003 Jan and 2004 October". Does the designation for unemployment appear in the 6th observation or the 7th (or both) of your data set?



      • #4
        Hi,

        thank you for your reply.

        The job search variables are there in the data set to measure how long an individual was unemployed or was in between jobs.
        There are a lot of duplicates because I am tracking their job info wave by wave.
        If there is a jobstart date but no end date, it might mean that the individual is still employed at the same job?
        The data that I uploaded might not have been the best. There are two more variables, job wave and job sequence, that I failed to add. I tried to upload a somewhat larger sample but got an error from the forum.

        The data set that I am using is retrospective. It follows the members of the same household over the waves/years.

        In your opinion, how should I create the time varying variable?

        Best,
        Bibek





        • #5
          Thank you for replying, but my questions remain largely unanswered. Here's what I think you need to do. If you do not already have -dataex- installed on your computer, run -ssc install dataex-. Then load your data set into Stata. -keep- the variables that are relevant to the problem here (probably that means including job wave and job sequence--maybe others?). And trim down the number of observations to a manageable number, but enough that they represent the variety of your data. Then run -dataex- to create some Stata code that will enable others to replicate this example data. Copy that code (including the code delimiters) directly from the Results window and paste it into the Forum editor. Now, use the Stata data editor to create by hand a new data set that is what you want to get from the example data set. Again run -dataex-, and again copy Stata's response from the Results window and paste into the Forum editor. If you choose your data example well and share it in this way, somebody will probably be able to figure out how to get from what you have to what you want.
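
As a rough illustration, the steps described in this post might look like the sketch below. The variable names (pid, jobseq, jobstarty, jobstartm, jobendy, jobendm) are taken from later posts in this thread. The job wave variable is left out only because its name is never given, and the cutoff of 20 observations is an arbitrary assumption.

```stata
* Install dataex from SSC (skip this if it is already installed)
ssc install dataex

* Keep only the variables relevant to the question
* (names assumed from later posts in the thread)
keep pid jobseq jobstarty jobstartm jobendy jobendm

* Produce copy-and-paste code for the first 20 observations
* (20 is an arbitrary choice; pick a range that shows the variety in the data)
dataex in 1/20
```

The output of -dataex- in the Results window, including its code delimiters, is what gets pasted into the Forum editor.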



          • #6
            Hi,

            I have attached a sample of my data file with this post.
            I want to generate a time varying variable called "employment status" to find out father's employment status at the time of first birth.

            for example:
            person 1
            during the time he/she is working, status = employed
            during the time he/she is not working status = unemployed

            Below is my code:
            Code:
            gen DURATION=jobstarty-jobendy if jobseq>=1
            replace DURATION=jobsearchstarty-jobsearchendy if jobseq<1 | jobseq>=1
            
            gen EVENT=.
            replace EVENT=0 if jobseq<1 | jobseq>=1
            replace EVENT=1 if jobseq>=1
            label define EVENT 0"unemployed" 1"employed"
            label values EVENT EVENT
            
            stset DURATION, failure(EVENT) origin(jobstarty) exit(jobendy) id(pid) 
            stset DURATION, failure(EVENT) origin(jobsearchstarty) exit(jobsearchendy) id(pid)
            Thank you.



            • #7
              Hi,
              I am still stuck on this problem. Any suggestions would be a great help.
              Below is the code that I have been working on:
              Code:
              gen empstatus=.
              replace empstatus=1 if jobseq>=1
              replace empstatus=0 if jobsearchp==2
              label define empstatus 1"employed" 0"unemployed"
              label values empstatus empstatus
              
              gen DURATION=jobendy-jobstarty & jobendm-jobstartm if empstatus==1 & jobendy!=.
              replace DURATION=jobsearchendy-jobsearchstarty & jobsearchendm-jobsearchendm if empstatus==0 & jobsearchstarty!=.
              thank you.
              Best,
              Bibek



              • #8
                I still do not follow how you are determining when a person is employed and when not. You have never explained it in words, and the various examples of code you have tried contradict each other. You need to be clear about that. Once you are clear about that, I suspect you will be able to write the code yourself.

                I can give you some constructive advice about calculating durations. The separate variables for months and years just make life hard for you. I recommend creating Stata daily date variables (the number of days elapsed since 1 January 1960) to handle these.

                Code:
                gen int jobstartdate = mdy(jobstartm, jobstartd, jobstarty)
                gen int jobenddate = mdy(jobendm, jobendd, jobendy)
                format jobstartdate jobenddate %td
                gen jobduration = jobenddate-jobstartdate
                label var jobduration "Duration of job (days)"

                For the job searches you seem to have only month and year, without a specific date. So for that I would do:
                Code:
                gen int jobsearchstartmonth = mofd(mdy(jobsearchstartm, 1, jobsearchstarty))
                gen int jobsearchendmonth = mofd(mdy(jobsearchendm, 1, jobsearchendy))
                format jobsearchstartmonth jobsearchendmonth %tm
                replace jobsearchduration = jobsearchendmonth - jobsearchstartmonth
                label var jobsearchduration "Duration of job search (months)"
                Note: If the job durations are long, measuring them in days may be inconvenient. You can, of course, convert to years by dividing the duration by 365.25. Also, since the job search durations are in months it might be sensible to have the job durations in that same unit. That can be done by multiplying by 12/365.25. (Change the variable label accordingly so you don't confuse yourself.)
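
If the job dates turn out to be recorded only as year and month, as the variable list in #1 suggests, one alternative is to build monthly dates directly with ym(), so that job durations come out in months without any conversion. This is only a sketch: jobstartmonth, jobendmonth and jobduration_m are hypothetical names introduced here.

```stata
* Monthly dates built directly from year and month; ym() returns a %tm value
gen int jobstartmonth = ym(jobstarty, jobstartm)
gen int jobendmonth   = ym(jobendy, jobendm)
format jobstartmonth jobendmonth %tm

* Duration in months; missing whenever the job has no recorded end date
gen jobduration_m = jobendmonth - jobstartmonth
label var jobduration_m "Duration of job (months)"
```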

                Many of your observations have time information about job starts with no corresponding information about job ends. So I have no idea how you expect to calculate durations for those jobs.

                I hope what I've shown you will help you with at least that piece of your project.



                • #9
                  Thank you so much. I will try to create the variable following your suggestions.

                  Best,
                  Bibek



                  • #10
                    Hi,

                    With the help of your suggestion I managed to create the variable. However, since information on the job search period was missing for a lot of individuals, I could not get the correct information for everyone.
                    Now,
                    I want to create an employment status variable using only pid, jobseq, jobstart y/m and job end y/m. An example of my data set is given in the table below. I want to indicate the period when they are employed as "employed", and the period (after the jobendy/m of jobseq 1 and before the jobstarty/m of jobseq 2) when they are out of employment as "unemployed". How can I code that?
                    pid  jobseq  jobstarty  jobstartm  jobendy  jobendm
                    101  1       1988       10         .        .
                    101  1       1988       10         2001     6
                    101  2       2004       6          2006     8
                    102  1       1989       4          2004     6
                    102  2       2005       5          .        .
                    thank you.

                    Best,
                    Bibek



                    • #11
                      Dear Statalist,

                      To clarify the above issue a bit more: in the table above, for pid 101, the jobendy/m of the past job (jobseq 1) is June 2001 and the jobstarty/m of the new job (jobseq 2) is June 2004.

                      There is a gap of 3 years between jobseq 1 and jobseq 2. I want Stata to code that gap as "unemployed". And if an individual started a new job as soon as he/she ended the previous job, or continued with the previous job and hasn't ended it yet, then I want it to be coded as "employed".

                      Thank you in advance.

                      Best,
                      Bibek




                      • #12
                        I appreciate your attempt, in response to William Lisowski's advice on the other thread with this topic, to better explain what you want. But I still don't understand it, and given the lack of response from others, I imagine nobody else does either.

                        Don't tell us more about what you want. SHOW what you want. Hand-calculate and post a table that looks like what you would want the output to look like if we started from the table you show in #10.



                        • #13
                          I agree with Clyde's request, which is the same as mine from the other thread.

                          Perhaps other Statalist members would understand the problem better if you were to post, for the data you show, the results of calculating by hand what you would expect the employment status variable to be.
                          In particular, you say you want Stata to code that gap as "unemployed" but we do not see any data to which that coding could be applied.

                          Let me guess, in the hope that we can move this discussion forward: Is it your intent that the results of doing what you request with the data you show us should then be applied to a different dataset that has pid and date values? Is that what we have not been grasping?

                          If that is the case, we need to see that data for pid 101 and 102.


                          If you do not have dataex installed already, run ssc install dataex; then read help dataex to learn how to use it; then use dataex to re-post the data you posted in #10 and the data to which you want to apply the coding.



                          • #14
                            The variables presented in the table above for pid 101 and 102 are the only ones that I have for creating an "employment status" variable. I have to base that variable on the job sequence and the job starting and ending year/month. All other periods are considered unemployed.

                            I am limited in the variables available to code employment status, hence the problem.
                            Last edited by Bibek Sharma; 10 Feb 2016, 13:52.



                            • #15
                              Dear Statalist,


                              pid  jobseq  jobstarty  jobstartm  jobendy  jobendm
                              101  1       1988       10         .        .
                              101  1       1988       10         2001     6
                              101  2       2004       6          2006     8
                              102  1       1989       4          2004     6
                              102  2       2005       5          .        .
                              Above is the table I have now and below is table that I want my output to look like after creating "employment status" variable.

                              pid  jobseq  jobstarty/jobsearchy  jobstartm/jobsearchm  jobendy/jobsearchendy  jobendm/jobsearchendm  Employment Status
                              101  1       1988                  10                    2001                   6                      employed
                              101  .       2001                  6                     2004                   6                      unemployed
                              101  2       2004                  6                     2006                   8                      employed
                              102  1       1989                  4                     2004                   6                      employed
                              102  .       2004                  6                     2005                   5                      unemployed
                              102  2       2005                  5                     interview time                                employed
                              Thank you.

                              Best,
                              Bibek
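
A minimal sketch of one way to get from the first table to the second is shown below. It is not a reply from the thread, and it rests on several assumptions: duplicate rows for the same pid and jobseq differ only in whether the end date has been filled in; any time between the recorded end of one job and the start of the next counts as unemployed; and an open-ended final job is left with a missing end date standing in for "interview time". The names start, stop, employed, nextstart and gap are hypothetical.

```stata
clear
input pid jobseq jobstarty jobstartm jobendy jobendm
101 1 1988 10 .    .
101 1 1988 10 2001 6
101 2 2004 6  2006 8
102 1 1989 4  2004 6
102 2 2005 5  .    .
end

* Monthly dates for the start and end of each job
gen int start = ym(jobstarty, jobstartm)
gen int stop  = ym(jobendy, jobendm)

* One row per pid and jobseq, keeping the recorded end date if there is one
* (assumption: wave-by-wave duplicates differ only in the end date)
collapse (min) start (max) stop, by(pid jobseq)
format start stop %tm

gen byte employed = 1

* Start of the next job within each person (missing for the last job)
bysort pid (jobseq): gen int nextstart = start[_n+1]

* Duplicate each job that is followed by a later job after a gap,
* then turn the copy into the unemployment spell between the two jobs
expand 2 if !missing(stop) & !missing(nextstart) & nextstart > stop, generate(gap)
replace employed = 0 if gap
replace start = stop if gap
replace stop = nextstart if gap
replace jobseq = . if gap

label define empstatus 0 "unemployed" 1 "employed"
label values employed empstatus

drop nextstart gap
sort pid start
list, sepby(pid) noobs
```

If an interview date is available in the full data, the missing stop on the last open job could be replaced with it. Whether that is appropriate depends on how the survey records ongoing jobs.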

