Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • retrieve past obs in a data panel, based a flexible lag period

    Hi, StataListers

    I am facing the following challenge. Based on the following dataset

    Code:
    clear
    input byte(id date value lag) float valueLagged
    1 1 1 . .
    1 2 2 . .
    1 3 3 . .
    1 4 4 2 .
    2 1 1 . .
    2 2 2 . .
    2 3 3 . .
    2 4 4 3 .
    2 5 5 1 .
    end
    
    . xtset id date
    
    Panel variable: id (unbalanced)
     Time variable: date, 1 to 5
             Delta: 1 unit
    I need to retrieve valueLagged based on lag column value, considering the panel structure.
    Code:
    
    . list, sepby(id )
    
         +------------------------------------+
         | id   date   value   lag   valueL~d |
         |------------------------------------|
      1. |  1      1       1     .          . |
      2. |  1      2       2     .          . |
      3. |  1      3       3     .          . |
      4. |  1      4       4     2          2 |
         |------------------------------------|
      5. |  2      1       1     .          . |
      6. |  2      2       2     .          . |
      7. |  2      3       3     .          . |
      8. |  2      4       4     3          1 |
      9. |  2      5       5     1          4 |
         +------------------------------------+
    thanks in advance.

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(id date value lag)
    1 1 1 .
    1 2 2 .
    1 3 3 .
    1 4 4 2
    2 1 1 .
    2 2 2 .
    2 3 3 .
    2 4 4 3
    2 5 5 1
    end
    
    xtset id date
    
    gen lag_date = date - lag
    by id (date): gen obs_no = _n
    rangestat (last) lag_obs = obs_no, by(id) interval(date lag_date lag_date)
    replace lag_obs = . if missing(lag_date)
    by id (date): gen wanted = value[lag_obs]
    
    list, sepby(id)
    Note: In your example data, the dates have no gaps, and they are also consecutive integers beginning at 1. Under those narrow conditions there is a simpler solution. But data sets with such simple values for date variables are uncommon, so I have written the code so that it requires neither of these assumptions to work correctly.

    -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer. It is available from SSC.

    Comment


    • #3
      Thanks Clyde, for considering wider conditions as in my actual database, gaps are not consecutive integers beginning at 1.

      Comment


      • #4
        You're welcome. I chose to write a more general solution than the example data needed because, in my experience, date variables are seldom as well behaved as what you showed. But, in general, it is best not to rely on responders to imagine difficulties that are not presented in example data. So everyone would be well advised, when posting example data, not to make up a simplified case, but rather to show example data that is representative of the real data, including its irregularities and oddball cases.

        Comment

        Working...
        X