Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Replacing rows of observations with values from subsequent rows

    Hi all,

    I have a situation where I have a panel setup with several observations for a given individual, and I need to condense this into one observation per individual by replacing values for given variables in subsequent rows into the first row for each individual. Where there are multiple values for different rows for a given person, I want to capture the last value.

    I am unable to share the actual dataset since sharing is not permitted, but I've reproduced a similar structure below. So for id 1, I would want to end up with just 1 row which has 124 for var1 and var2, and 216 for var3 and var4. Similarly, for id 2, I would want 206 for the first two variables and 210 for the next two, and for id3, 205 for the 1st and 207 for the rest. Is there any code that can achieve this?
    id var1 var2 var3 var4
    1 124 124 . .
    1 . 124 124 .
    1 . . 124 .
    1 . . 216 216
    2 206 206 . .
    2 . . 210 210
    2 . . 210 210
    3 205 205
    3 205 205 . .
    3 . 207 207 207
    3

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte d int(var1 var2 var3 var4)
    1 124 124   .   .
    1   . 124 124   .
    1   .   . 124   .
    1   .   . 216 216
    2 206 206   .   .
    2   .   . 210 210
    2   .   . 210 210
    3 205 205   .   .
    3 205 205   .   .
    3   . 207 207 207
    3   .   .   .   .
    end
    
    collapse (lastnm) var* , by(d)
    
    list 
    
         +-------------------------------+
         | d   var1   var2   var3   var4 |
         |-------------------------------|
      1. | 1    124    124    216    216 |
      2. | 2    206    206    210    210 |
      3. | 3    205    207    207    207 |
         +-------------------------------+

    Comment


    • #3
      Thanks very much Nick, this worked well!

      Comment

      Working...
      X