  • Create loop based on multiple qualifiers and panel dataset

    Hello everyone,

    Although I've been trying for several hours, I'm afraid I can't solve this one by myself. I have a dataset with a rotational design for the period 2005-2015, where each year's sample consists of four subsamples: one newly selected in that year and three others that have been followed for 2, 3 and 4 years, respectively. Every subsample is dropped after a four-year follow-up. Hence, I have at most four yearly observations for each person (person_id) on the variable wstatus (working status), which may or may not change during this period.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float line int year long(hh_id person_id) byte wstatus
    64965 2006 138460 13846001 2
    64966 2007 138460 13846001 1
    64967 2008 138460 13846001 1
    64968 2009 138460 13846001 2
    64969 2006 138460 13846002 4
    64970 2007 138460 13846002 4
    64971 2008 138460 13846002 4
    64972 2009 138460 13846002 4
    64973 2006 138470 13847001 6
    64974 2007 138470 13847001 6
    64975 2008 138470 13847001 6
    64976 2009 138470 13847001 6
    64977 2006 138470 13847002 7
    64978 2007 138470 13847002 7
    64979 2008 138470 13847002 7
    64980 2009 138470 13847002 7
    64981 2006 138470 13847003 5
    64982 2007 138470 13847003 5
    64983 2008 138470 13847003 2
    64984 2009 138470 13847003 1
    end


    What I'm trying to do is construct a loop that generates a new variable, say "trans", whose values depend on the change in wstatus between each pair of consecutive years. For the example above and person 13846001, trans would be empty for 2006; for 2007 it would be based on comparing wstatus in 2006 and 2007; for 2008, on comparing 2007 and 2008; and for 2009, on comparing 2008 and 2009. I don't think the specific qualifiers are of much relevance, but for the sake of this example let's say that if wstatus==1 at t and wstatus==1 at t+1 for person i, then trans==100 at t+1 for the same person. I have another 10 combinations to consider.

    I did come up with something, but it obviously doesn't work, as it changes the values of all observations for each person:

    Code:
    gen trans=.
    by person_id (year), sort: gen yid = _n
    summarize yid, meanonly
    forval i= 1/`r(max)' {
    by person_id: replace trans=100 if wstatus[`i']==1 & wstatus[`i'+1]==1
    }

    I could really use your help!

    Thank you in advance
    Thanos

    edit: I don't want to change the form of data from long to wide
    Last edited by Thanos Chantzaras; 24 Mar 2019, 13:03.

  • #2
    Well, if there are 7 different possible values of wstatus (and perhaps there are even more) then you have to consider a total of 49 possible transitions. Perhaps you might want to consider n-to-n as 0 in all cases, but then that still leaves you with 42 other transitions. So you need to figure out some coding for that. Once you do, create a new data set with three variables: wstatus, prior_status, and code, where code is the numerical code you want to assign to that particular transition. Save that new data set--let's call it transition_codes.dta. Then you can do this:

    Code:
    xtset person_id year
    gen prior_status = L1.wstatus
    merge m:1 wstatus prior_status using transition_codes, keep(master match)
    And if the variable prior_status, a mild concession towards reshaping wide, offends you, you can always drop it.
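
    For readers who have not built such a lookup file before, here is a minimal sketch of one way to create transition_codes.dta. The four rows and their code values are made up purely for illustration and are not the poster's actual coding scheme; only the three variable names come from the description above.

    Code:
    * Hypothetical transition codes, for illustration only
    clear
    input byte(wstatus prior_status) int code
    1 1 100
    1 2 110
    2 1 120
    2 2 130
    end
    * each (wstatus, prior_status) pair may appear only once for -merge m:1-
    isid wstatus prior_status
    save transition_codes, replace

    With keep(master match), any observation whose pair of values has no row in the lookup file, including each person's first year, where prior_status is missing, stays in the data with code missing.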

    In principle you could do this without even that mild concession. If, suppose, you only needed to encode change vs no change, you could do this:

    Code:
    by person_id (year), sort: gen byte changed = (wstatus != wstatus[_n-1]) if _n > 1
    In theory, this code could be modified to a whole series of -replace changed = ... if wstatus == ... & wstatus[_n-1] == ...- commands. But with 49 possible combinations of current and past, that approach does not scale up well, and it would be nearly impossible for a human being to write that code correctly.
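
    For a single rule, though, the direct approach is short. The sketch below applies only the one transition named in the question (wstatus 1 followed by 1 is coded 100) to the first person from the example data. It assumes each person's years are consecutive with no gaps, so that [_n-1] really is the previous year.

    Code:
    clear
    input int year long person_id byte wstatus
    2006 13846001 2
    2007 13846001 1
    2008 13846001 1
    2009 13846001 2
    end
    gen trans = .
    * under -by-, _n counts within person, so [_n-1] is that person's previous row;
    * in each person's first row wstatus[_n-1] is missing, so trans stays missing
    by person_id (year), sort: replace trans = 100 if wstatus == 1 & wstatus[_n-1] == 1
    list, sepby(person_id)

    Only the 2008 observation meets the condition here, because 2007 and 2008 are the only consecutive pair where wstatus is 1 in both years.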



    • #3
      Thank you! It works just fine! I guess I tried to approach it in a much more complicated way than was necessary. Many combinations collapse to the same values, so I don't have to code all possible alternatives.

      If it's not too much trouble, I'd like to ask how I could have solved it if it didn't concern the transition between consecutive years. Let's say that I want to compute the value of variable trans at time t for person i so that trans(t)i = 100 if wstatus(t-3)i == 1 & wstatus(t-1)i == 2. Is there any way that my coding above could be modified to solve this? Or is there any other workaround?

      Really appreciate all the help



      • #4
        Let's say that I want to compute the value of variable trans at time t for person i so that trans(t)i = 100 if wstatus(t-3)i == 1 & wstatus(t-1)i == 2. Is there any way that my coding above could be modified to solve this?
        Add to the existing code:

        Code:
        by person_id (year), sort: replace trans = 100 if wstatus[_n-3] == 1 & wstatus[_n-1] == 2
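
        If some person-years might be missing from the panel, subscripts such as [_n-3] count rows rather than years. A possible alternative, sketched here on made-up data for a single hypothetical person, is to use time-series lag operators after -xtset-, which look back by calendar year and return missing when that year is absent.

        Code:
        * Hypothetical data, for illustration only
        clear
        input int year long person_id byte wstatus
        2006 1 1
        2007 1 3
        2008 1 2
        2009 1 4
        end
        xtset person_id year
        gen trans = .
        * L3. is the value three years earlier, L1. the value one year earlier
        replace trans = 100 if L3.wstatus == 1 & L1.wstatus == 2
        list

        Here only 2009 has both a three-year lag equal to 1 (2006) and a one-year lag equal to 2 (2008), so it is the only observation set to 100.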
