
  • Creating a variable for a "life event" in panel data which occurred in different waves for different individuals

    Hi there,
    I am using the BHPS and testing how quickly respondents' happiness measures revert back to original levels after winning a lottery.

    I have panel data in the following format:
    pid wave lotwin happiness
    30 1 0 4
    30 2 0 4
    30 3 0 4
    30 4 1 6
    30 5 0 5
    30 6 0 4
    30 7 0 4
    31 1 0 4
    31 2 0 4
    31 3 1 7
    31 4 0 7
    31 5 0 6
    where pid is the unique identifier, happiness is on a scale of 1-12 and lotwin is a binary variable with 1 if you won the lottery and 0 if you did not.
    Each wave corresponds to a different year (e.g. wave 1 = 1996, wave 2 = 1997, etc).

    I am trying to graph respondents' happiness scores in times T-2, T-1, T, T+1, T+2 where T is the year the individual won the lottery, and calculate the ratio happiness(T+2)/happiness(T-2) for individuals to classify them as "fast to adapt" and "slow to adapt" given an arbitrary ratio cut off point.

    As individuals won the lottery in different waves, I am struggling to work out how to create the variable T and how to use it as a reference point.

    Please could you be of assistance. Thank you in advance for your help!

  • #2
    Welcome to Statalist. Please take the time to read through the FAQs and familiarize yourself with presenting data examples using the dataex command. For graphing purposes, there is no difficulty defining different periods per lottery winner but do note that in terms of statistical analysis, happiness may vary between periods (e.g., people may generally be less happy during recessions), and you need to take into account time effects.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(pid wave lotwin happiness)
    30 1 0 4
    30 2 0 4
    30 3 0 4
    30 4 1 6
    30 5 0 5
    30 6 0 4
    30 7 0 4
    31 1 0 4
    31 2 0 4
    31 3 1 7
    31 4 0 7
    31 5 0 6
    end
    
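    * Event time: 0 in the wave of the win
    gen time = 0 if lotwin == 1
    * Count forward from the win (replace runs top to bottom, so one pass fills T+1, T+2, ...)
    bys pid (wave): replace time = time[_n-1]+1 if missing(time)
    * Count backward; each pass fills one earlier wave, so two passes reach T-2
    bys pid (wave): replace time = time[_n+1]-1 if missing(time)
    bys pid (wave): replace time = time[_n+1]-1 if missing(time)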
    label define time -2 "T-2" -1 "T-1" 0 "T" 1 "T+1" 2 "T+2"
    label values time time
    set scheme s1color
    line happiness time if inrange(time, -2, 2), by(pid, note("")) xlabel(,valuelabel)
    [Graph.png: happiness plotted against event time (T-2 to T+2), one panel per pid]

    Last edited by Andrew Musau; 09 Jan 2019, 12:32.



    • #3
      EDITED: This crossed with Andrew Musau's post.

      Code:
      gen year = 1995 + wave  // wave 1 = 1996, as stated in the question
      * sorting on lotwin within pid puts the winning wave last (assumes lotwin is 0/1 and each pid wins exactly once)
      bysort pid (lotwin): gen win_year = year[_N]  
      sort pid wave
      gen win_time = year - win_year  // creating the -2, -1, 0, 1, 2, etc
      
      . list, noobs sepby(pid) abbrev(14)
      
        +--------------------------------------------------------------+
        | pid   wave   year   lotwin   happiness   win_year   win_time |
        |--------------------------------------------------------------|
        |  30      1   1996        0           4       1999         -3 |
        |  30      2   1997        0           4       1999         -2 |
        |  30      3   1998        0           4       1999         -1 |
        |  30      4   1999        1           6       1999          0 |
        |  30      5   2000        0           5       1999          1 |
        |  30      6   2001        0           4       1999          2 |
        |  30      7   2002        0           4       1999          3 |
        |--------------------------------------------------------------|
        |  31      1   1996        0           4       1998         -2 |
        |  31      2   1997        0           4       1998         -1 |
        |  31      3   1998        1           7       1998          0 |
        |  31      4   1999        0           7       1998          1 |
        |  31      5   2000        0           6       1998          2 |
        +--------------------------------------------------------------+
      
      * I never do graphs in Stata, so these will be pretty basic
      line happiness win_time if inrange(win_time, -2, 2), ylabel(0(1)7)
      twoway connected happiness win_time if inrange(win_time, -2, 2), ylabel(0(1)7) by(pid)
      Last edited by David Benson; 09 Jan 2019, 12:42.
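
For the ratio happiness(T+2)/happiness(T-2) asked about in the original question, one possible sketch uses Stata's panel lead and lag operators rather than an explicit event-time variable. The names ratio and fast_adapt and the cutoff of 1.1 are made up for illustration and do not come from the thread; the sketch also assumes the ratio is only wanted on the row of the win.

```stata
* Example data from the thread
clear
input byte(pid wave lotwin happiness)
30 1 0 4
30 2 0 4
30 3 0 4
30 4 1 6
30 5 0 5
30 6 0 4
30 7 0 4
31 1 0 4
31 2 0 4
31 3 1 7
31 4 0 7
31 5 0 6
end

* Declare the panel so that L. and F. stay within pid and respect gaps in wave
xtset pid wave

* On the winning row only: happiness two waves ahead over happiness two waves back
gen ratio = F2.happiness / L2.happiness if lotwin == 1

* Hypothetical cutoff, chosen only for illustration
local cutoff 1.1
gen byte fast_adapt = ratio <= `cutoff' if !missing(ratio)
label define fast_adapt 0 "slow to adapt" 1 "fast to adapt"
label values fast_adapt fast_adapt

list pid wave ratio fast_adapt if lotwin == 1, noobs
```

With this data, pid 30 gets a ratio of 4/4 = 1 and pid 31 gets 6/4 = 1.5. Winners with no observation at T-2 or T+2 get a missing ratio and are left unclassified.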



      • #4
        Thank you so much Andrew Musau David Benson! This is so helpful.

        If instead lotwin was not a binary variable but rather categorical (2=winnings of 1K+, 1=winnings of <1K and 0=never won the lottery), how would you incorporate this into your code?

        If I wanted to look at the change in happiness as opposed to raw happiness values in T-2, T-1, etc, would you have to generate a new variable for the change in happiness? (such that you could achieve a bar graph with x-axis = 3 bars for each category of lotwin and y-axis = mean change in GHQ (e.g. from T-2 to T+2))



        • #5
          1) To handle lotwin when it is categorical rather than binary (2=winnings of 1K+, 1=winnings of <1K and 0=never won the lottery):

          I would create an indicator variable d_lotwin = (lotwin==1 | lotwin==2) and then run the rest of my code on it. Alternatively, create separate indicator variables for lotwin==1 and lotwin==2, because you will want to see whether the effect on happiness differs for winnings of 1K+ vs <1K.
          Code:
          gen d_lotwin = (lotwin==1 | lotwin==2)  // 1 = any win, 0 otherwise
          tabulate lotwin, gen(win)  // one 0/1 indicator per observed category of lotwin

          2) To create the change from the year before:
          Code:
          bysort pid (wave): gen diff_happ = happiness - happiness[_n-1]  // change in number (missing in each pid's first wave)
          bysort pid (wave): gen diff_pct = (happiness - happiness[_n-1]) / happiness[_n-1]   // proportional change (0.500 = 50%)
          * I don't know if "percent change in happiness score" is meaningful here; I provide it just to show how you could do it if needed
          format diff_pct %9.3f
          
          . list, noobs sepby(pid) abbrev(12)
          
            +--------------------------------------------------------+
            | pid   wave   lotwin   happiness   diff_happ   diff_pct |
            |--------------------------------------------------------|
            |  30      1        0           4           .          . |
            |  30      2        0           4           0      0.000 |
            |  30      3        0           4           0      0.000 |
            |  30      4        1           6           2      0.500 |
            |  30      5        0           5          -1     -0.167 |
            |  30      6        0           4          -1     -0.200 |
            |  30      7        0           4           0      0.000 |
            |--------------------------------------------------------|
            |  31      1        0           4           .          . |
            |  31      2        0           4           0      0.000 |
            |  31      3        1           7           3      0.750 |
            |  31      4        0           7           0      0.000 |
            |  31      5        0           6          -1     -0.143 |
            +--------------------------------------------------------+
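
For the bar graph asked about in #4 (mean change in happiness from T-2 to T+2 by category of win), a rough sketch follows. It reuses the example data but recodes pid 31's win as lotwin = 2 purely for illustration; the names win_wave, win_time, win_cat, happ_pre, happ_post, change and one_per_pid are likewise made up here. Respondents who never won have no T, so they end up with missing values and drop out of the graph unless some reference wave is defined for them.

```stata
* Example data from the thread; pid 31's win is recoded to 2 (hypothetical)
clear
input byte(pid wave lotwin happiness)
30 1 0 4
30 2 0 4
30 3 0 4
30 4 1 6
30 5 0 5
30 6 0 4
30 7 0 4
31 1 0 4
31 2 0 4
31 3 2 7
31 4 0 7
31 5 0 6
end

* Wave of the first win of either size (missing if the person never won)
egen win_wave = min(cond(inlist(lotwin, 1, 2), wave, .)), by(pid)
gen win_time = wave - win_wave

* Category of that win, copied to every row of the person
egen win_cat = max(cond(wave == win_wave, lotwin, .)), by(pid)
label define win_cat 1 "<1K" 2 "1K+"
label values win_cat win_cat

* Happiness at T-2 and at T+2, copied to every row of the person
egen happ_pre  = max(cond(win_time == -2, happiness, .)), by(pid)
egen happ_post = max(cond(win_time ==  2, happiness, .)), by(pid)
gen change = happ_post - happ_pre

* Keep one row per person so that each winner counts once in the mean
egen one_per_pid = tag(pid)
graph bar (mean) change if one_per_pid, over(win_cat) ytitle("Mean change in happiness, T-2 to T+2")
```

Using min() for win_wave means that a person with several wins is aligned on the first one; that is a design choice for this sketch, not something settled in the thread.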
