  • Generating a dummy variable based on changes in employment status over time (panel data)

    Hi everyone,

    I'm looking to create a new dummy variable that takes the value of 1 if an individual exits unemployment (i.e. goes from being unemployed to being employed), and takes the value of 0 if an individual stays unemployed. I'm working with panel data that has a variable "Whether unemployed in month x" that indicates whether an individual was unemployed in January, February, March, etc. of a certain year, and this covers the whole of the time period of 2007-2015 (8 years) -- this means that there is one variable for each month of each year to indicate whether the individual was unemployed in said month. The "whether unemployed in month x" variable takes the value of 1 if the individual was unemployed in the month in question, and 0 if the individual was employed.

    What I have done so far is to isolate the variables that I am interested in, which are all of the "whether unemployed in month x" variables for the time period 2007-2015, as well as the family ID variables. I then reshaped the data to "long", and generated two variables to indicate the change of an individual's employment status:
    1. a variable, "exitue" which takes the value of 1 if an individual goes from unemployed in month t to employed in month t+1
    2. a variable, "stayue" which takes the value of 1 if an individual is unemployed in month t and stays unemployed in month t+1

    What I'm struggling to figure out now is how to generate the new dummy variable that I need, which takes the value of 1 if an individual exits unemployment and 0 if an individual stays unemployed.

    Here are the commands that I have used so far:

    reshape long wtrue, i(familyid) j(time)
    bys familyid: gen exitue=1 if(wtrue[_n]==1 & wtrue[_n+1]==0)
    bys familyid: gen stayunemp=1 if(wtrue[_n]==1 & wtrue[_n+1]==1)

    Any advice would be much appreciated!

  • #2
    Welcome to Statalist.

    Let me first point out that the code you show needs adjustment in the bysort:
    Code:
    bysort familyid (time): generate exitue=1 if (wtrue[_n]==1 & wtrue[_n+1]==0)
    bysort familyid (time): generate stayunemp=1 if (wtrue[_n]==1 & wtrue[_n+1]==1)
    Without including (time), you cannot be assured that within familyid, the data are sorted by increasing value of time.

    The two variables you generate unfortunately have the value missing or 1; for analytic purposes you want 0 or 1. Consider the following code - three variants of your command that produce the result I believe you want.
    Code:
    generate exitue = 0
    bysort familyid (time): replace exitue=1 if (wtrue[_n]==1 & wtrue[_n+1]==0)
    Code:
    bysort familyid (time): generate exitue = wtrue==1 & wtrue[_n+1]==0
    Code:
    bysort familyid (time): generate exitue = wtrue & ! wtrue[_n+1]
    The key is that "logical expressions" produce numeric results, just like other expressions.
    • logical expressions treat 0 as false and nonzero as true
    • the result of a logical expression is 0 if it is false and 1 if it is true
    The meaning of _n is "the current observation" as Stata applies your generate command to successive observations, so the subscript [_n] is unnecessary.
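
    To see the two rules above in action, here is a minimal sketch that can be pasted into the Command window or a do-file. It uses only display and literal values, so it does not depend on any dataset; the values 3, 2 and 5 are arbitrary choices for illustration.

    ```stata
    * A logical expression evaluates to 1 when true and 0 when false
    display 3 > 2
    display 3 < 2
    * Any nonzero value counts as true, so negating it gives 0
    display !5
    display !0
    * Missing (.) is nonzero, so it is also treated as true
    display !.
    ```

    These lines display 1, 0, 0, 1 and 0 in that order. The last line matters for the third variant above: in the final observation of each familyid, wtrue[_n+1] is missing, so ! wtrue[_n+1] is 0 and exitue is set to 0 there rather than to missing.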


    • #3
      Hi William,

      Thanks a lot for your help! I've modified my coding to incorporate your suggestions, and it's worked nicely.

      With regards to creating the new dummy variable (which I will call "empchange") that takes the value of 1 if an individual exits unemployment (i.e. goes from being unemployed to being employed), and takes the value of 0 if an individual stays unemployed, I've tried using the following command:

      Code:
      bysort familyid (time): gen empchange = exitue==1 & stayue==0
      Am I on the right track here?


      • #4
        I'm sorry I was unclear: my intent was that any one of the three versions of exitue in post #2 will accomplish what you wanted for the "new dummy variable" in post #1. That is, a new variable is not necessary; a correct formulation of your proposed exitue variable will suffice.

        Here's an example that demonstrates all four possible combinations of wtrue & wtrue[_n+1]. I added one small change so that the new variable is defined as missing in the last wave for the familyid, because you don't know what happens in the wave after the last one you observe.
        Code:
        clear
        input int familyid time wtrue
        42 1 1
        42 2 1
        42 3 0
        42 4 0
        42 5 1
        end
        bysort familyid (time): generate exitue=1 if (wtrue[_n]==1 & wtrue[_n+1]==0)
        bysort familyid (time): generate stayunemp=1 if (wtrue[_n]==1 & wtrue[_n+1]==1)
        bysort familyid (time): generate newexitue = wtrue & ! wtrue[_n+1] if _n<_N
        list, clean noobs abbreviate(12)
        Code:
        . list, clean noobs abbreviate(12)
        
            familyid   time   wtrue   exitue   stayunemp   newexitue  
                  42      1       1        .           1           0  
                  42      2       1        1           .           1  
                  42      3       0        .           .           0  
                  42      4       0        .           .           0  
                  42      5       1        .           .           .
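
        Post #1 describes a variable that is 1 when someone exits unemployment and 0 when they stay unemployed, which leaves open what to do with months in which the person is already employed. If those months should be missing rather than 0, one possible sketch is below. This is an assumption about the intended coding, not something stated in the thread, and the variable name exit_atrisk is hypothetical.

        ```stata
        clear
        input int familyid time wtrue
        42 1 1
        42 2 1
        42 3 0
        42 4 0
        42 5 1
        end
        * Defined only for months spent unemployed with an observed next month:
        * 1 = employed next month, 0 = still unemployed next month
        bysort familyid (time): generate byte exit_atrisk = (wtrue[_n+1]==0) ///
            if wtrue==1 & !missing(wtrue[_n+1])
        list, clean noobs abbreviate(12)
        ```

        With this sample, exit_atrisk is 0 at time 1, 1 at time 2, and missing at times 3 to 5. The !missing() condition also covers the last wave, because wtrue[_n+1] is missing there. Note that the /// line continuation works in a do-file but not when typed interactively.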


        • #5
          Thanks William!


          • #6
            Just as a follow-up question, I've realised that there is also the possibility of an individual having multiple separate instances of unemployment throughout the time period -- for example he could be unemployed for the first time from Oct 2007-Nov 2007, and another time from July 2008-Sept 2008 -- this would result in an individual potentially having multiple instances of changes in employment status. How can I modify my Stata commands to account for this?


            • #7
              I think it's time to advise you to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

              Your post #6 says nothing about how you want to account for multiple periods of unemployment and re-employment. As the code now stands, extending the sample data in post #4 yields the following.
              Code:
              . list, clean noobs abbreviate(12)
              
                  familyid   time   wtrue   exitue   stayunemp   newexitue  
                        42      1       1        .           1           0  
                        42      2       1        1           .           1  
                        42      3       0        .           .           0  
                        42      4       0        .           .           0  
                        42      5       1        .           1           0  
                        42      6       1        1           .           1  
                        42      7       0        .           .           0  
                        42      8       0        .           .           .
              What is it that you want to be changed, or added, in this?
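
              If the aim turns out to be telling separate unemployment spells apart, one possible approach is to number them within each familyid. The sketch below is only a guess at what post #6 might need; the variable names spellstart and spell are hypothetical, and the input data are the eight sample observations listed above.

              ```stata
              clear
              input int familyid time wtrue
              42 1 1
              42 2 1
              42 3 0
              42 4 0
              42 5 1
              42 6 1
              42 7 0
              42 8 0
              end
              * A spell starts when wtrue is 1 and the previous month was employed,
              * or when wtrue is 1 in the first observed month (possibly left-censored)
              bysort familyid (time): generate byte spellstart = wtrue==1 & (_n==1 | wtrue[_n-1]==0)
              * The running sum of start flags numbers the spells within each familyid
              bysort familyid (time): generate int spell = sum(spellstart)
              * Keep the spell number only for months spent unemployed
              replace spell = . if wtrue != 1
              list, clean noobs abbreviate(12)
              ```

              For this sample, spell is 1 at times 1 and 2, 2 at times 5 and 6, and missing in the employed months. The exit indicator from post #4 can then be summarized or modeled by spell if that is what is wanted.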


              • #8
                I rather wish post #6 had mentioned this new topic, started 15 minutes before post #6:

                https://www.statalist.org/forums/for...ble-panel-data

                which appears to subsume the question of post #6 into a broader topic.


                • #9
                  Hi William,

                  Thanks for the response -- I think I've managed to resolve the issue now, so please ignore my previous post. Apologies for being vague, I'll make an effort to include more information in future posts.
