  • Problem making multiple lagged and lead variables for unbalanced panel data

    I am having trouble coding up to 5 lagged and lead variables; the code I am using doesn't correctly account for missing waves.
    I am using Stata 15.0.

    I am using panel data from the British Household Panel Survey. My variables are:
    pid = ID number of the individual, wave = time, M0 = whether an individual migrated in that wave.

    The code that I used was:
    sort pid wave
    gen lag1 = M0[_n-1] if wave==wave[_n-1]+1
    gen lag2 = M0[_n-2] if wave==wave[_n-1]+1
    gen lag3 = M0[_n-3] if wave==wave[_n-1]+1
    gen lag4 = M0[_n-4] if wave==wave[_n-1]+1
    gen lag5 = M0[_n-5] if wave==wave[_n-1]+1

    gen lead1 = M0[_n+1] if wave==wave[_n-1]+1
    gen lead2 = M0[_n+2] if wave==wave[_n-1]+1
    gen lead3 = M0[_n+3] if wave==wave[_n-1]+1
    gen lead4 = M0[_n+4] if wave==wave[_n-1]+1
    gen lead5 = M0[_n+5] if wave==wave[_n-1]+1
    gen lead6 = M0[_n+6] if wave==wave[_n-1]+1

    I was wondering if there is a way to take the unbalanced panel into account when creating lagged and lead variables. I thought "if wave==wave[_n-1]+1" would solve this issue?

    This was the result that I got:
    [Image: Yes.png]

    It doesn't correctly account for the missing wave (wave 11, lines 8->9).
    Thank you for your time.

  • #2
    This is a sequel to https://www.statalist.org/forums/for...ced-panel-data

    It's best not to start new threads unless the question is quite different.

    You still haven't provided example data in the form requested. I see an image and I can't copy and paste from it. You are asked to use dataex in FAQ Advice #12.

    With irregularly spaced data, there are at least two ways of defining previous values. This example based on yours shows some technique and incidentally how data examples can be given using dataex.

    Code:
     
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(pid wave)
     7857  8
     7857  9
     7857 10
     7857 11
    14578  6
    14578  8
    14578  9
    14578 10
    14578 12
    14578 13
    14578 14
    14578 15
    14578 16
    14578 17
    14578 18
    end
    Code:
    gen y = _n
    
    * use previous wave recorded, ignoring gaps 
    forval lag = 1/5 {
        bysort pid (wave) : gen y`lag' = y[_n-`lag']
    }
    
    list, sepby(pid)
    
         +--------------------------------------------+
         |   pid   wave    y   y1   y2   y3   y4   y5 |
         |--------------------------------------------|
      1. |  7857      8    1    .    .    .    .    . |
      2. |  7857      9    2    1    .    .    .    . |
      3. |  7857     10    3    2    1    .    .    . |
      4. |  7857     11    4    3    2    1    .    . |
         |--------------------------------------------|
      5. | 14578      6    5    .    .    .    .    . |
      6. | 14578      8    6    5    .    .    .    . |
      7. | 14578      9    7    6    5    .    .    . |
      8. | 14578     10    8    7    6    5    .    . |
      9. | 14578     12    9    8    7    6    5    . |
     10. | 14578     13   10    9    8    7    6    5 |
     11. | 14578     14   11   10    9    8    7    6 |
     12. | 14578     15   12   11   10    9    8    7 |
     13. | 14578     16   13   12   11   10    9    8 |
     14. | 14578     17   14   13   12   11   10    9 |
     15. | 14578     18   15   14   13   12   11   10 |
         +--------------------------------------------+
    
    * use wave number literally
    
    tsset pid wave
           panel variable:  pid (unbalanced)
            time variable:  wave, 6 to 18, but with gaps
                    delta:  1 unit
    
    forval LAG = 1/5 {
        gen Y`LAG' = L`LAG'.y
    }
    
    list pid wave y Y*, sepby(pid)
    
         +--------------------------------------------+
         |   pid   wave    y   Y1   Y2   Y3   Y4   Y5 |
         |--------------------------------------------|
      1. |  7857      8    1    .    .    .    .    . |
      2. |  7857      9    2    1    .    .    .    . |
      3. |  7857     10    3    2    1    .    .    . |
      4. |  7857     11    4    3    2    1    .    . |
         |--------------------------------------------|
      5. | 14578      6    5    .    .    .    .    . |
      6. | 14578      8    6    .    5    .    .    . |
      7. | 14578      9    7    6    .    5    .    . |
      8. | 14578     10    8    7    6    .    5    . |
      9. | 14578     12    9    .    8    7    6    . |
     10. | 14578     13   10    9    .    8    7    6 |
     11. | 14578     14   11   10    9    .    8    7 |
     12. | 14578     15   12   11   10    9    .    8 |
     13. | 14578     16   13   12   11   10    9    . |
     14. | 14578     17   14   13   12   11   10    9 |
     15. | 14578     18   15   14   13   12   11   10 |
         +--------------------------------------------+
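
    For contrast with the condition used in #1, here is a minimal sketch on a made-up four-row panel (pid 14578 with wave 11 absent; the names lag2_sub and lag2_ts are illustrative only). The test wave==wave[_n-1]+1 only checks that the immediately preceding row is one wave back, so y[_n-2] can still reach across an earlier gap, whereas L2.y looks up wave - 2 directly.

    ```stata
    * hypothetical mini-panel: wave 11 is absent
    clear
    input float(pid wave)
    14578  9
    14578 10
    14578 12
    14578 13
    end
    gen y = _n
    sort pid wave

    * condition from #1: only tests that the previous row is one wave back
    gen lag2_sub = y[_n-2] if wave == wave[_n-1] + 1

    * time-series operator: looks up wave - 2 within the same pid
    tsset pid wave
    gen lag2_ts = L2.y

    list, sepby(pid)
    ```

    At wave 13 the subscript version returns the wave 10 value, while L2.y is missing because wave 11 was never observed. Without a by pid: prefix, subscripts can also cross from one individual to the next.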
    
    




    • #3
      Thank you, I applied the second version and it gave the results that I wanted.
      How can I make 5 lead variables that take the unbalanced data into account?



      • #4
        -help tsvarlist-
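
        A minimal sketch of what that help file points to, assuming the data are declared with tsset pid wave as in #2: the F (lead) operator mirrors L, so F1.y is the value of y at wave + 1 and is missing when that wave was not observed. The four-row panel and the names y and lead1-lead5 are illustrative; in the original data M0 would take the place of y.

        ```stata
        * hypothetical mini-panel: wave 11 is absent
        clear
        input float(pid wave)
        14578  9
        14578 10
        14578 12
        14578 13
        end
        gen y = _n

        tsset pid wave

        * F#.y is y at wave + # within the same pid, missing if that wave is absent
        forval lead = 1/5 {
            gen lead`lead' = F`lead'.y
        }

        list, sepby(pid)
        ```

        Here lead1 is missing at wave 10 because wave 11 is absent, while lead2 at wave 10 picks up the wave 12 value.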

