  • Fill column based on simple rules

    SAMPLE DATA:
    Code:
    input person time var1
     1 1 1
     1 2 2
     1 3 .
     1 4 3
     1 5 1
     1 6 4
     2 1 2
     2 2 2
     2 3 .
     2 4 .
     2 5 3
     2 6 3
     3 1 3
     3 2 4
     3 3 .
     3 4 .
     3 5 .
     3 6 .
    end
    Rules for var1:

    If var1 is missing, fill it with the previous value.
    Once var1 equals 3 at some time t, set all later var1 values to 3, except that a 4 stays a 4.
    If var1 is missing after a 4, remove that row.

    DESIRED OUTPUT:

    Code:
    input person time var1
     1 1 1
     1 2 2
     1 3 2
     1 4 3
     1 5 3
     1 6 4
     2 1 2
     2 2 2
     2 3 2
     2 4 2
     2 5 3
     2 6 3
     3 1 3
     3 2 4
    end
    Last edited by sladmin; 17 Oct 2019, 08:17. Reason: anonymize original poster

  • #2
    In Stata what you call columns are called variables.

    This is really dangerous territory without a time variable. Please revise your example to show the time variable you are using. (If you don't have one, why not? There is no scope for reliably identifying previous values without one, as a sort for other reasons destroys the existing order.)

    • #3
      Originally posted by Nick Cox
      In Stata what you call columns are called variables.

      This is really dangerous territory without a time variable. Please revise your example to show the time variable you are using. (If you don't have one, why not? There is no scope for reliably identifying previous values without one, as a sort for other reasons destroys the existing order.)

      Nick Cox, thank you for this feedback. I have edited the question to include the time variable.

      • #4

        replace dummy=dummy[_n-1] if dummy==.
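
        Note that this carries values forward in whatever order the data happen to be sorted, including from the last row of one person into the first row of the next. A minimal sketch of a within-person variant, assuming identifier and time variables named person and time as in the sample data (the four-row dataset here is made up purely for illustration):

        Code:
        clear
        * Hypothetical toy data: person 2 starts with a missing value
        input person time var1
         1 1 1
         1 2 .
         2 1 .
         2 2 5
        end

        * Sort by person, then time; [_n-1] now refers to the previous row of the same person
        bysort person (time): replace var1 = var1[_n-1] if missing(var1)

        list, sepby(person)

        Here person 1's missing value at time 2 becomes 1, while person 2's missing value at time 1 stays missing because that person has no earlier observation. This covers only the first rule.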

        • #5
          Thanks for the update. Note that the code in #4 is dangerous because it does not respect the panel structure (which, although never stated explicitly, I imagine is what you want).

          Does this help?

          Code:
          clear 
          
          input person time var1
           1 1 1
           1 2 2
           1 3 .
           1 4 3
           1 5 1
           1 6 4
           2 1 2
           2 2 2
           2 3 .
           2 4 .
           2 5 3
           2 6 3
           3 1 3
           3 2 4
           3 3 .
           3 4 .
           3 5 .
           3 6 .
          end
          
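          * Declare the panel so that L. means the previous time within the same person
          tsset person time
          * previous = last nonmissing value of var1 before the current time
          gen previous = .
          replace previous = cond(!missing(L.var1), L.var1, L.previous)
          
          * Rule 3: drop rows where var1 is missing and the last known value is 4
          drop if missing(var1) & previous == 4
          
          * Rule 1: fill the remaining missing values with the last known value
          replace var1 = previous if missing(var1)
          
          list, sepby(person)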
          
               +---------------------------------+
               | person   time   var1   previous |
               |---------------------------------|
            1. |      1      1      1          . |
            2. |      1      2      2          1 |
            3. |      1      3      2          2 |
            4. |      1      4      3          2 |
            5. |      1      5      1          3 |
            6. |      1      6      4          1 |
               |---------------------------------|
            7. |      2      1      2          . |
            8. |      2      2      2          2 |
            9. |      2      3      2          2 |
           10. |      2      4      2          2 |
           11. |      2      5      3          2 |
           12. |      2      6      3          3 |
               |---------------------------------|
           13. |      3      1      3          . |
           14. |      3      2      4          3 |
               +---------------------------------+
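
          In the listing above, person 1 at time 5 still has var1 = 1, whereas the desired output has 3, so the second rule (once a 3 appears, later values become 3 unless they are 4) is not yet applied. What follows is only a sketch of one way to cover all three rules, not a definitive solution. It assumes that "missing after a 4" means the last observed value is 4, and the helper variables wasmissing and seen3 are names invented here.

          Code:
          clear
          input person time var1
           1 1 1
           1 2 2
           1 3 .
           1 4 3
           1 5 1
           1 6 4
           2 1 2
           2 2 2
           2 3 .
           2 4 .
           2 5 3
           2 6 3
           3 1 3
           3 2 4
           3 3 .
           3 4 .
           3 5 .
           3 6 .
          end

          * Remember which rows were missing before anything is filled
          gen byte wasmissing = missing(var1)

          * Rule 1: carry the last observed value forward within person, in time order
          bysort person (time): replace var1 = var1[_n-1] if missing(var1)

          * Rule 3: drop originally missing rows whose carried-forward value is 4
          drop if wasmissing & var1 == 4

          * Rule 2: flag rows at or after the first 3 within person,
          * then set flagged values to 3 unless they are 4
          bysort person (time): gen byte seen3 = sum(var1 == 3) > 0
          replace var1 = 3 if seen3 & var1 != 4

          drop wasmissing seen3
          list, sepby(person)

          Traced by hand on the sample data, this gives var1 = 1, 2, 2, 3, 3, 4 for person 1; 2, 2, 2, 2, 3, 3 for person 2; and 3, 4 for person 3, which matches the desired output. If "after a 4" should instead mean only the row immediately following a 4, the drop condition would need to change.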
