  • Replace missing values in a variable with non-missing values from other variables, in order of priority

    I have data in the following format.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(x1 x2 x3 x4 x5 )
    1 . 3 4 2
    2 . . . . 
    . 2 . 3 1  
    . . . . . 
    . . . 4 3 
    . 5 2 . .
    . . 1 . 5
    . 2 . 3 .
    end
    How can I easily generate a new variable, say x6, that is equal to x1 but with missing values of x1 replaced by non-missing values from the other variables in this order of priority: x4 first; if x4 is missing, x3; if x3 is missing, x2; if x2 is missing, x5? If x1, x2, x3, x4 and x5 are all missing, x6 should remain missing.
    Thanks

  • #2
    Luke:
    Here is a basic approach:
    Code:
    . clonevar x6=x1
    (6 missing values generated)
    
    . replace x6=x4 if missing(x6)
    (3 real changes made)
    
    . replace x6=x3 if missing(x6)
    (2 real changes made)
    
    . replace x6=x2 if missing(x6)
    (0 real changes made)
    
    . replace x6=x5 if missing(x6)
    (0 real changes made)
    
    . list
    
         +-----------------------------+
         | x1   x2   x3   x4   x5   x6 |
         |-----------------------------|
      1. |  1    .    3    4    2    1 |
      2. |  2    .    .    .    .    2 |
      3. |  .    2    .    3    1    3 |
      4. |  .    .    .    .    .    . |
      5. |  .    .    .    4    3    4 |
         |-----------------------------|
      6. |  .    5    2    .    .    2 |
      7. |  .    .    1    .    5    1 |
      8. |  .    2    .    3    .    3 |
         +-----------------------------+
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)
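
    The same sequence of replace steps can also be written as a loop over the fallback variables in priority order, so that the order lives in one place and every fallback, including x5, is covered. This is only a sketch on the example data from the first post; the loop macro name v is chosen here for illustration.

    ```stata
    * Example data from the first post
    clear
    input int(x1 x2 x3 x4 x5)
    1 . 3 4 2
    2 . . . .
    . 2 . 3 1
    . . . . .
    . . . 4 3
    . 5 2 . .
    . . 1 . 5
    . 2 . 3 .
    end

    * Start from x1, then fill remaining gaps from x4, x3, x2, x5 in that order
    clonevar x6 = x1
    foreach v of varlist x4 x3 x2 x5 {
        replace x6 = `v' if missing(x6)
    }
    list
    ```

    Observations in which every variable is missing stay missing in x6, because no replace ever finds a non-missing value to copy.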

    • #3
      Code:
      gen x6 = cond(x1 < ., x1, cond(x4 < ., x4, cond(x3 < ., x3, cond(x2 < ., x2, x5))))
      See https://www.stata-journal.com/sjpdf....iclenum=pr0016 for much more discussion.
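
      For what it is worth, egen's rowfirst() function looks like a possible one-line alternative, since it returns the first non-missing value among the variables listed. The sketch below assumes the variables are taken in the order typed, which is why x4 x3 x2 are written out rather than abbreviated as a range; x6_alt is a name invented here, and the assert only checks agreement with the cond() expression on the example data.

      ```stata
      * Example data from the first post
      clear
      input int(x1 x2 x3 x4 x5)
      1 . 3 4 2
      2 . . . .
      . 2 . 3 1
      . . . . .
      . . . 4 3
      . 5 2 . .
      . . 1 . 5
      . 2 . 3 .
      end

      * Nested cond() chain from this post
      gen x6 = cond(x1 < ., x1, cond(x4 < ., x4, cond(x3 < ., x3, cond(x2 < ., x2, x5))))

      * First non-missing value, with the variables listed in priority order
      egen x6_alt = rowfirst(x1 x4 x3 x2 x5)

      * Stops with an error if the two results differ in any observation
      assert x6_alt == x6
      ```

      If the assert passes silently, the two approaches agree on these data; on other data it would be worth rerunning the same check.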

      • #4
        Tongue in cheek: Nick's code is a bit more efficient than my previous trivial attempt!
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          To be fair, some people hate code like that in #3, and I know why; the link in #3 says more. But you have to learn not only how to write such code but also how to read it, say as

          Code:
          gen x6 = cond(x1 < ., x1, cond(x4 < ., x4, cond(x3 < ., x3, cond(x2 < ., x2, x5))))
          IF x1 is not missing RETURN x1

          ELSE IF x4 is not missing RETURN x4

          and so on.
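
          One way to make that reading visible in the code itself is to put each cond() on its own line, joined with ///, so that each line can carry its own comment. This is a sketch intended for a do-file, since /// is not understood when typed interactively in the Command window; the layout is a matter of taste and the result should be the same as with the one-line version.

          ```stata
          * Example data from the first post
          clear
          input int(x1 x2 x3 x4 x5)
          1 . 3 4 2
          2 . . . .
          . 2 . 3 1
          . . . . .
          . . . 4 3
          . 5 2 . .
          . . 1 . 5
          . 2 . 3 .
          end

          gen x6 = cond(x1 < ., x1,   /// IF x1 is not missing RETURN x1
                   cond(x4 < ., x4,   /// ELSE IF x4 is not missing RETURN x4
                   cond(x3 < ., x3,   /// ELSE IF x3 is not missing RETURN x3
                   cond(x2 < ., x2,   /// ELSE IF x2 is not missing RETURN x2
                                x5)))) // ELSE RETURN x5, which may itself be missing
          list
          ```

          The final branch needs no test of its own: if x5 is missing as well, x6 is simply missing, which is the behaviour asked for in the first post.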

          • #6
            Thank you Carlo!

            • #7
              Thank you very much Nick.
