
  • Generating variable to track when change in another variable occurs

    Hello,

    I am trying to generate a variable (timecode) that records time relative to the period in which a different variable (hgc) first reaches a certain value (in this case, 12): negative numbers in the periods leading up to that point, 0 in the period when it occurs, and positive numbers in the periods after. For example, -3 -2 -1 0 1 2 3, with 0 marking the period when hgc reaches 12.

    I am dealing with panel data, roughly 4000 individuals, with approximately 80,000 observations. For simplicity's sake, I am toying with only the individual's id, year, and highest grade completed (hgc), with timecode being the name of the new variable I wish to generate. I have (correctly, I hope) provided some sample data via dataex below.

    The issue I am struggling to overcome is that the year in which hgc first reaches that value (here, 12) is not the same for every individual. This has proven too much for my limited Stata skills, and I am hoping for some guidance from a more experienced user.

    Best,
    Joseph



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(id hgc) int year float timecode
    1 12 1981 .
    1 12 1982 .
    1 12 1983 .
    1 12 1984 .
    1 12 1985 .
    1 12 1986 .
    1 12 1987 .
    1 12 1988 .
    1 12 1989 .
    1 12 1990 .
    1 12 1991 .
    1 12 1992 .
    1 13 1996 .
    1 13 1998 .
    1 13 2000 .
    1 13 2002 .
    1 13 2004 .
    1 13 2006 .
    1 13 2008 .
    1 13 2010 .
    1 13 2012 .
    2 11 1981 .
    2 12 1982 .
    2 12 1983 .
    2 12 1984 .
    2 12 1985 .
    2 12 1986 .
    2 12 1987 .
    2 12 1988 .
    2 12 1989 .
    2 13 1990 .
    2 13 1991 .
    2 13 1992 .
    2 13 1996 .
    2 13 1998 .
    2 13 2000 .
    2 13 2002 .
    2 13 2004 .
    2 13 2006 .
    2 13 2008 .
    2 13 2010 .
    2 13 2012 .
    3 11 1981 .
    3 11 1982 .
    3 12 1983 .
    3 12 1984 .
    3 12 1985 .
    3 12 1986 .
    3 12 1987 .
    3 12 1988 .
    3 12 1989 .
    3 12 1990 .
    3 12 1991 .
    3 12 1992 .
    3 12 1996 .
    3 12 1998 .
    3 12 2000 .
    3 12 2002 .
    3 12 2004 .
    3 12 2006 .
    3 12 2008 .
    3 12 2010 .
    3 12 2012 .
    4 11 1981 .
    4 11 1982 .
    4 11 1983 .
    4 12 1984 .
    4 12 1985 .
    4 12 1986 .
    4 12 1987 .
    4 12 1988 .
    4 12 1989 .
    4 12 1990 .
    4 12 1991 .
    4 12 1992 .
    4 12 1996 .
    4 12 1998 .
    4 13 2000 .
    4 13 2002 .
    4 13 2004 .
    4 13 2006 .
    4 14 2008 .
    4 15 2010 .
    4 16 2012 .
    end
    Last edited by Joe Davis; 25 Apr 2017, 11:04.

  • #2
    This seems to be the same question as in this thread from two days ago:

    http://www.statalist.org/forums/foru...y-introduction

    Code:
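    * year / (hgc >= 12) is missing unless hgc >= 12, so min() returns the first such year
    egen when12 = min(year / (hgc >= 12)), by(id)
    * negative before that year, 0 in it, positive after
    gen wanted = year - when12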
    See the references given in the cited thread for more details.
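
    To see why this works: `hgc >= 12` evaluates to 1 or 0, and dividing by 0 gives missing, which `min()` ignores. One caveat worth checking in real data is that Stata treats a missing value as larger than any number, so a missing hgc would count as `>= 12`. The sketch below is a minimal illustration on made-up data (the ids and values are assumptions, not the poster's data), and the `!missing(hgc)` guard is an addition to the code above.

    ```stata
    clear
    input byte(id hgc) int year
    1 11 1981
    1 12 1982
    1 13 1984
    2 11 1981
    2  . 1982
    2 12 1986
    3 10 1981
    3 11 1982
    end

    * the condition is 1 or 0; year / 0 is missing, which min() ignores.
    * !missing(hgc) stops a missing hgc from counting as >= 12
    egen when12 = min(year / (hgc >= 12 & !missing(hgc))), by(id)
    gen wanted = year - when12

    list, sepby(id)
    ```

    Here id 2 gets when12 = 1986 rather than 1982, and id 3, who never reaches 12, gets missing for both new variables.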



    • #3
      Nick,

      Thank you. It does seem that the two questions are similar. I will check out the post and cited thread you have referenced.

      Best,
      Joseph



      • #4
        Fine, but I have suggested code for you too.



        • #5
          The provided code seems to be a good start. However, I should have highlighted that this is an unbalanced panel, so I do not have an observation for every year. After 1992, the results are incorrect, but I'll keep playing with it.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input byte(id hgc) int year float(timecode when12 wanted)
          1 12 1981 . 1981  0
          1 12 1982 . 1981  1
          1 12 1983 . 1981  2
          1 12 1984 . 1981  3
          1 12 1985 . 1981  4
          1 12 1986 . 1981  5
          1 12 1987 . 1981  6
          1 12 1988 . 1981  7
          1 12 1989 . 1981  8
          1 12 1990 . 1981  9
          1 12 1991 . 1981 10
          1 12 1992 . 1981 11
          1 13 1996 . 1981 15
          1 13 1998 . 1981 17
          1 13 2000 . 1981 19
          1 13 2002 . 1981 21
          1 13 2004 . 1981 23
          1 13 2006 . 1981 25
          1 13 2008 . 1981 27
          1 13 2010 . 1981 29
          1 13 2012 . 1981 31
          2 11 1981 . 1982 -1
          2 12 1982 . 1982  0
          2 12 1983 . 1982  1
          2 12 1984 . 1982  2
          2 12 1985 . 1982  3
          2 12 1986 . 1982  4
          2 12 1987 . 1982  5
          2 12 1988 . 1982  6
          2 12 1989 . 1982  7
          2 13 1990 . 1982  8
          2 13 1991 . 1982  9
          2 13 1992 . 1982 10
          2 13 1996 . 1982 14
          2 13 1998 . 1982 16
          2 13 2000 . 1982 18
          2 13 2002 . 1982 20
          2 13 2004 . 1982 22
          2 13 2006 . 1982 24
          2 13 2008 . 1982 26
          2 13 2010 . 1982 28
          2 13 2012 . 1982 30
          3 11 1981 . 1983 -2
          3 11 1982 . 1983 -1
          3 12 1983 . 1983  0
          3 12 1984 . 1983  1
          3 12 1985 . 1983  2
          3 12 1986 . 1983  3
          3 12 1987 . 1983  4
          3 12 1988 . 1983  5
          3 12 1989 . 1983  6
          3 12 1990 . 1983  7
          3 12 1991 . 1983  8
          3 12 1992 . 1983  9
          3 12 1996 . 1983 13
          3 12 1998 . 1983 15
          3 12 2000 . 1983 17
          3 12 2002 . 1983 19
          3 12 2004 . 1983 21
          3 12 2006 . 1983 23
          3 12 2008 . 1983 25
          3 12 2010 . 1983 27
          3 12 2012 . 1983 29
          4 11 1981 . 1984 -3
          4 11 1982 . 1984 -2
          4 11 1983 . 1984 -1
          4 12 1984 . 1984  0
          4 12 1985 . 1984  1
          4 12 1986 . 1984  2
          4 12 1987 . 1984  3
          4 12 1988 . 1984  4
          4 12 1989 . 1984  5
          4 12 1990 . 1984  6
          4 12 1991 . 1984  7
          4 12 1992 . 1984  8
          4 12 1996 . 1984 12
          4 12 1998 . 1984 14
          4 13 2000 . 1984 16
          4 13 2002 . 1984 18
          4 13 2004 . 1984 20
          4 13 2006 . 1984 22
          4 14 2008 . 1984 24
          4 15 2010 . 1984 26
          4 16 2012 . 1984 28
          end



          • #6
            Sorry, what's incorrect about that?

            The subtraction year - when12 makes no assumptions whatsoever about equal or unequal spacing of years.

            If you want to count observations before, during and after the first observation of 12, then that's a puzzling thing to want, but it needs different code. I didn't guess at that from #1.
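
            To make the point about spacing concrete, here is a small sketch using five of the rows for id 4 from #1 (the subset is chosen only for illustration). The gap between 1985 and 1996 simply shows up as a jump in the result.

            ```stata
            clear
            input byte(id hgc) int year
            4 11 1983
            4 12 1984
            4 12 1985
            4 12 1996
            4 12 1998
            end

            egen when12 = min(year / (hgc >= 12)), by(id)
            * elapsed calendar years: gaps in the panel appear as jumps
            gen wanted = year - when12

            list, sepby(id)
            ```

            In this subset, wanted is 12 for 1996 because 1996 is 12 years after 1984, even though it is only the second observation after 1985. A count of observations would instead run -1, 0, 1, 2, 3 over these five rows.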



            • #7
              Here's some code for counting back and forth from year zero.

              Code:
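              * record the observation number of year zero within each id
              bysort id (year): gen WANTED = _n if year == when12
              * sort that value first (missing sorts last) and copy it to every observation
              bysort id (WANTED): replace WANTED = WANTED[1]
              * observation number relative to year zero
              bysort id (year): replace WANTED = _n - WANTED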
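
              As a rough check, the sketch below runs this code on made-up data with gaps (ids 1 and 2 and their values are assumptions for illustration) and compares it with a possible alternative: the `min(... / condition)` device from #2 applied to the observation number instead of the year. The names obsno, obs0 and steps are invented here.

              ```stata
              clear
              input byte(id hgc) int year
              1 11 1981
              1 12 1984
              1 12 1985
              1 13 1996
              2 10 1981
              2 11 1982
              end

              egen when12 = min(year / (hgc >= 12)), by(id)

              * code from #7
              bysort id (year): gen WANTED = _n if year == when12
              bysort id (WANTED): replace WANTED = WANTED[1]
              bysort id (year): replace WANTED = _n - WANTED

              * alternative: position of each observation minus position of year zero
              bysort id (year): gen obsno = _n
              egen obs0 = min(obsno / (year == when12)), by(id)
              gen steps = obsno - obs0

              * the two agree; id 2 never reaches 12, so both are missing there
              assert WANTED == steps
              list id year hgc WANTED steps, sepby(id)
              ```

              For id 1 both variables run -1, 0, 1, 2 regardless of the gaps between years. Anyone who never reaches 12 ends up with missing values, which may or may not be what is wanted downstream.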
