  • New variable for change over time in a variable

    Good evening,

    I have a dataset with a variable called occupation, with values such as paid employment, job seeker following job loss, pensioner, etc. It is recorded for every individual over multiple months:

    individual  month     occupation
    1           january   job seeker
    1           february  job seeker
    1           march     paid employment
    2           january   job seeker
    2           february  paid employment
    2           march     paid employment
    3           january   paid employment

    Now I want to create a new variable for unemployment duration: the number of months an individual spent as "job seeker" before moving into "paid employment".
    So the unemployment duration for individual 1 is two months, the unemployment duration for individual 2 is one month, etc.

    What is the correct code to calculate this?

    Thank you very much!

  • #2
    Well, if the tableau you show really represents your data, and, in particular, the observations on each person are always consecutive months with no gaps, then it is very simple:

    Code:
    by individual (month), sort: egen total_unemployment = total(occupation == "job seeker")
    (Note: Assumes occupation is a string variable. If it's actually a value-labeled numeric variable, this code will throw a type mismatch error message.)

    If, however, as is typically true of real world data, some months' observations are missing for some individuals, then the above can produce incorrect results. In that case, the foremost obstacle will be converting the month variable into a numeric sequence and then doing some arithmetic. But before venturing into code for that, I need to know things about the data that cannot be exhibited in the kind of tableau you have shown.

    So, if the above code does not solve your problem, when you post back, be sure to use the -dataex- command and post an actual example from your Stata data set. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it.

    -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
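
    To illustrate the kind of conversion meant here, a rough sketch only. It assumes month is a string holding English month names and that a numeric year variable exists; year is hypothetical, since the tableau in #1 shows none. The names mdate and months_skipped are made up for illustration.

    ```stata
    * ASSUMPTION: -year- is a numeric variable; it does not appear in the example in #1
    * Build a Stata monthly date from the month name and the year
    gen mdate = monthly(month + " " + string(year), "MY")
    format mdate %tm

    * Months missing between consecutive observations of the same person
    by individual (mdate), sort: gen months_skipped = mdate - mdate[_n-1] - 1 if _n > 1
    ```

    A value of months_skipped above 0 marks an observation that follows a gap, which is where the simple count above could go wrong.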

    The correct code will also depend on how you want to handle gaps in the data. If somebody is a job seeker in January, has no observation for February, and is in paid employment in March, what do you want the calculated duration of unemployment to be?
    Last edited by Clyde Schechter; 01 Jun 2022, 13:49.



    • #3
      Hi Clyde,

      Thank you for your response. It is indeed a value-labeled numeric variable, but I assume that can be fixed by replacing "job seeker" with the number assigned to it.
      For example, paid employment = 1 and job seeker following job loss = 4, as shown below.


      I have changed the data a bit and first want to know how many years lie between the changes in occupation. And yes, some years and months are missing for some individuals.


      I should add that I'm new to Stata!




      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double nomem_encr float year double occupation
      800009 2015 11
      800009 2016 11
      800009 2017 11
      800009 2018 11
      800009 2019 11
      800009 2020 11
      800012 2015  1
      800015 2010  1
      800015 2011  1
      800015 2012  1
      800015 2013  1
      800015 2015  1
      800015 2016  1
      800015 2017  1
      800015 2018  1
      800015 2019  1
      800015 2020  1
      800018 2012  .
      800033 2010  7
      800033 2011  7
      800033 2012  7
      800033 2015  7
      800039 2017  .
      800042 2010  8
      800042 2011  8
      800042 2012  8
      800042 2013  8
      800042 2015  8
      800042 2016  8
      800042 2017  8
      800042 2018  1
      800042 2019  1
      800054 2015  9
      800054 2016  9
      800054 2017  9
      800054 2018  9
      800054 2019  9
      800057 2010  1
      800057 2011  1
      800057 2012  1
      800057 2013  .
      800057 2015  1
      800057 2016  1
      800057 2017  1
      800057 2018  1
      800057 2019  1
      800057 2020  1
      800058 2020  7
      800073 2017  .
      800073 2018 12
      800073 2019 12
      800085 2015  1
      800085 2016  .
      800085 2017  1
      800085 2018  1
      800085 2019  .
      800100 2015  7
      800100 2016  1
      800100 2017  4
      800100 2018  7
      800100 2019  7
      800100 2020  1
      800109 2012  1
      800115 2017  .
      800119 2010  6
      800119 2011  6
      800119 2012  6
      800119 2013  6
      800119 2015  6
      800119 2016  6
      800119 2017  6
      800119 2018  6
      800119 2019  9
      800119 2020  9
      800125 2010  1
      800127 2020  .
      800128 2017  .
      800128 2019 12
      800128 2020 12
      800131 2010  1
      800131 2011  1
      800131 2012  1
      800131 2013  1
      800131 2015  9
      800131 2016  9
      800131 2017  9
      800131 2018  9
      800131 2019  9
      800131 2020  9
      800151 2017  .
      800158 2010  .
      800158 2011  1
      800161 2012  .
      800161 2013  3
      800161 2015  3
      800161 2016  3
      800161 2017  3
      800161 2018  3
      800161 2019  .
      800161 2020  .
      end
      label values occupation cw10c525
      label def cw10c525 1 "paid employment", modify
      label def cw10c525 3 "autonomous professional, freelancer, or self-employed", modify
      label def cw10c525 4 "job seeker following job loss", modify
      label def cw10c525 6 "exempted from job seeking following job loss", modify
      label def cw10c525 7 "attends school or is studying", modify
      label def cw10c525 8 "takes care of the housekeeping", modify
      label def cw10c525 9 "is pensioner ([voluntary] early retirement, old age pension scheme)", modify
      label def cw10c525 11 "performs unpaid work while retaining unemployment benefit", modify
      label def cw10c525 12 "performs voluntary work", modify
      Last edited by Jill Groenewegen; 01 Jun 2022, 14:09.



      • #4
        Thanks. Yes, you can patch the code in #2 by just replacing "job seeker" with the number 4.
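
        For reference, a minimal sketch of that patch, using the variable names from the -dataex- example in #3 (nomem_encr, year, occupation) and its value label cw10c525. The second form uses Stata's "text":labelname lookup, so the number 4 is not hard-coded. The new variable names are illustrative, and neither line has been checked against the full data set.

        ```stata
        * Number of observations per person coded as job seeker (code 4)
        by nomem_encr (year), sort: egen total_unemployment = total(occupation == 4)

        * Same count, looking the code up from the value label instead
        by nomem_encr (year), sort: egen total_unemployment2 = total(occupation == "job seeker following job loss":cw10c525)
        ```

        Because the example data are now yearly, this counts years rather than months. An observation with missing occupation contributes 0, since a missing value is not equal to 4.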

        But you didn't answer my question about how to count the missing years (months in the original). For example, there are observations in your data set where occupation is a missing value. Does that count as a year of unemployment or not? Does the answer depend on whether the closest preceding (or following) observation has occupation == 4? And what about situations where a year is just not instantiated for a person? For example, individual 800015 has observations for 2010 through 2013 and 2015 through 2020, but nothing for 2014. Should 2014 be considered a year of unemployment in a situation like this? And, again, does the answer depend on whether 2013 or 2015 is a year of unemployment or not?
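
        To see how often each of these situations arises, something like the following sketch may help. It assumes the -dataex- example from #3 is loaded; skipped_years, occ_missing, seeker_before and seeker_after are illustrative names, not existing variables. It only describes the gaps and does not decide how they should be counted.

        ```stata
        sort nomem_encr year

        * Calendar years skipped since this person's previous observation
        by nomem_encr: gen skipped_years = year - year[_n-1] - 1 if _n > 1

        * Year is present but occupation is missing
        gen byte occ_missing = missing(occupation)

        * Was the person a job seeker (code 4) in the adjacent observations?
        by nomem_encr: gen byte seeker_before = occupation[_n-1] == 4 if _n > 1
        by nomem_encr: gen byte seeker_after = occupation[_n+1] == 4 if _n < _N

        * Observations that follow a gap, e.g. 2015 for individual 800015
        list nomem_encr year occupation skipped_years if skipped_years > 0 & !missing(skipped_years), sepby(nomem_encr)

        * How often a missing occupation follows a job-seeker observation
        tabulate occ_missing seeker_before, missing
        ```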
