Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Shift non-missing values to the left in an observation

    Hi,
    I cant figure out on how to shift data to the left so that missings are removed. My problem is that the number places that have to be shiftet are individual per observation.
    In my real data, I have 70 days where observations are missing at the beginning, and for each day I have 4 variables which need to be shiftet, but I hope the solution for the simple case can be generalized.

    A simple example data is this:
    Code:
    clear
    input id day1 day2 day3 day4
    
    id day1 day2 day3 day4
    1 5 4 . 2
    2 . 1 2 5
    3 . . . 4
    4 . . 3 4
    end
    In this example, for id 2, I need to shift data by one, while for id 3 it has to be shiftet by 3 and sid 4 by two, so that I end up with this:
    Code:
    clear
    input id day1 day2 day3 day4
    
    id day1 day2 day3 day4
    1 5 4 . 2
    2 1 2 5 .
    3 4 . . .
    4 3 4 . .
    end
    I have tried to work with both loops and reshaping, but I cant get the individual number of shifts right.
    Does anyone have a solution on this?
    I am relatively new at STATA, so please bear over with me if this is a very easy problem :-)

    Thanks!
    -Anna

  • #2
    I'm thinking that you actually intend your first line to be
    Code:
    1 5 4 2 .
    On that assumption, here's a brute force approach that seems to work. There's doubtless a more efficient way to do this, but this should be fast enough.
    Code:
    // Create bigger bank of test data than you gave.
    clear
    local N = 20
    local nday = 10
    set obs `N'
    forval i = 1/`nday' {
      gen byte day`i' = ceil(runiform() * 5) if (runiform() > 0.4)
    }  
    list
    // Real stuff starts here.
    qui gen byte imiss = .
    forval  j = 1/`nday'  {  
        local ndaym1 = `nday' - 1
        forval i = 1/`ndaym1' {
           qui replace imiss = missing(day`i')
           local next = `i' + 1
           qui replace day`i' = day`next' if imiss
           qui replace day`next' = . if imiss
        }
    }    
    list

    Comment


    • #3
      Here is another approach based on reshape. I am making the same assumption as Mike about observation #1.

      You could reshape back. But the reason the problem arises in the first place is that this is an awkward data layout. For most Stata purposes, you're better off long.

      Code:
      clear
      input id day1 day2 day3 day4
      id day1 day2 day3 day4
      1 5 4 . 2
      2 . 1 2 5
      3 . . . 4
      4 . . 3 4
      end
      
      reshape long day , i(id) j(newday) 
      rename day whatever 
      bysort id (newday) : gen day = sum(whatever < .) if whatever < . 
      bysort id (day) : replace day = _n 
      drop newday  
      
      list, sepby(id) 
      
           +---------------------+
           | id   whatever   day |
           |---------------------|
        1. |  1          5     1 |
        2. |  1          4     2 |
        3. |  1          2     3 |
        4. |  1          .     4 |
           |---------------------|
        5. |  2          1     1 |
        6. |  2          2     2 |
        7. |  2          5     3 |
        8. |  2          .     4 |
           |---------------------|
        9. |  3          4     1 |
       10. |  3          .     2 |
       11. |  3          .     3 |
       12. |  3          .     4 |
           |---------------------|
       13. |  4          3     1 |
       14. |  4          4     2 |
       15. |  4          .     3 |
       16. |  4          .     4 |
           +---------------------+
      If you are determined to have your wide layout, you can get it with

      Code:
      reshape wide whatever, i(id) j(day)
      rename (whatever*) (day=)

      Comment


      • #4
        Originally posted by Mike Lacy View Post
        I'm thinking that you actually intend your first line to be
        Code:
        1 5 4 2 .
        On that assumption, here's a brute force approach that seems to work. There's doubtless a more efficient way to do this, but this should be fast enough.
        Actually, I want to keep missings in the "middle" of the data, only the missings to the left must be removed. But I think I somehow can figure out how to work around that.

        Thanks a lot for both solutions! I will go with the first one since I'm more familiar with loops.

        Comment


        • #5
          Here's my code adapted for that different problem:

          Code:
          clear
          input id day1 day2 day3 day4
          1 5 4 . 2
          2 . 1 2 5
          3 . . . 4
          4 . . 3 4
          end
          
          reshape long day , i(id) j(newday) 
          rename day whatever 
          bysort id (newday) : gen day = sum(whatever < .) 
          
          list 
          
          replace day = . if day == 0 
          bysort id (day newday) : replace day = _n 
          drop newday  
          
          list, sepby(id) 
          
               +---------------------+
               | id   whatever   day |
               |---------------------|
            1. |  1          5     1 |
            2. |  1          4     2 |
            3. |  1          .     3 |
            4. |  1          2     4 |
               |---------------------|
            5. |  2          1     1 |
            6. |  2          2     2 |
            7. |  2          5     3 |
            8. |  2          .     4 |
               |---------------------|
            9. |  3          4     1 |
           10. |  3          .     2 |
           11. |  3          .     3 |
           12. |  3          .     4 |
               |---------------------|
           13. |  4          3     1 |
           14. |  4          4     2 |
           15. |  4          .     3 |
           16. |  4          .     4 |
               +---------------------+
          The reshape back to wide (not recommended) is the same code as before.
          Last edited by Nick Cox; 09 Mar 2017, 02:18.

          Comment

          Working...
          X