Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • loop for missing values

    Hi everyone! I am trying to complete a database in form of a "tree", adding as values the last available observation when a new branch exists. This is, for level 1 I need all the observations ( I already got this), but for the subsequent levels I just need to complete those obvservations for which the next level is not missing. Eg: I should have all observations marked with X:
    id1 name1 cod1 id2 name2 cod2 id3 name3 cod3 id4 name4 cod4
    X1 Hotel1 879
    X X X X2 Hotel2 456
    X X X X X X X31 Hotel31 447
    X X X X X X x32 Hotel32 775
    X X X X X X X33 Hotel33 656
    X X X X X X X X X X4 Hotel4 894
    I encoded all variables. Since I have 8 levels, I got the correct result by using 6 loops like this (one for each level and running them from the 7th to 1)

    foreach var of varlist *7{
    replace `var'=`var'[_n-1] if id8!=.
    }

    and another for the first level:
    foreach var of varlist *1{
    replace `var'=`var'[_n-1] if `var'==.
    }


    What I am trying now is to simplify the code by unifying the 6 loops only into 1, but i can´t find out how. I have tried something like this but the result is not correct.

    foreach var of varlist id2-cod7{
    local numvar = substr("`var'", -1,.)
    rename `var' `var'_`numvar'
    replace `var'_`numvar' = `var'_`numvar'[_n-1] if `var'_`numvar'+1!=.
    }

    I would really appreciate if anyone could give a hint of how to proceed!

  • #2
    if you don't encode the variables I believe this gets you what you want,
    Code:
    ds
    foreach v in `r(varlist)' {
    replace `v' = `v'[_n-1] if !mi(`v'[_n-1]) & mi(`v'[_n])
    }

    Comment


    • #3
      Originally posted by Øyvind Snilsberg View Post
      if you don't encode the variables I believe this gets you what you want,
      Code:
      ds
      foreach v in `r(varlist)' {
      replace `v' = `v'[_n-1] if !mi(`v'[_n-1]) & mi(`v'[_n])
      }
      Thank you!! Clearly not enconding and using missing() it's a better option! However, this code completes more observation than requiered, since it assings values to all observations behind the first that is not missing in all levels. Maybe I didnt make myself clear enough in my post, but the idea is to fill in only until the last level avaible. In the example, if there was an additional line with only values until level 2, levels 3 and 4 should not be copied from above.

      Comment


      • #4
        I am a bit confused. If this was your data,
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str2 id1 str6 name1 str3 cod1 str2 id2 str6 name2 str3(cod2 id3) str7 name3 str3 cod3
        "X1" "Hotel1" "879" ""   ""       ""    ""    ""        ""  
        ""   ""       ""    "X2" "Hotel2" "456" ""    ""        ""  
        ""   ""       ""    ""   ""       ""    "X31" "Hotel31" "447"
        ""   ""       ""    ""   ""       ""    "X32" "Hotel32" "775"
        ""   ""       ""    ""   ""       ""    "X33" "Hotel33" "656"
        ""   ""       ""    ""   ""       ""    ""    ""        ""  
        ""   ""       ""    ""   ""       ""    ""    ""        ""  
        end
        what should the result look like?

        Perhaps,
        Code:
        ds
        foreach v in `r(varlist)' {
        replace `v' = `v'[_n-1] if !mi(`v'[_n-1]) & mi(`v'[_n]) & _n<[6]
        }
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str2 id1 str6 name1 str3 cod1 str2 id2 str6 name2 str3(cod2 id3) str7 name3 str3 cod3
        "X1" "Hotel1" "879" ""   ""       ""    ""    ""        ""   
        "X1" "Hotel1" "879" "X2" "Hotel2" "456" ""    ""        ""   
        "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X31" "Hotel31" "447"
        "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X32" "Hotel32" "775"
        "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X33" "Hotel33" "656"
        ""   ""       ""    ""   ""       ""    ""    ""        ""   
        ""   ""       ""    ""   ""       ""    ""    ""        ""   
        end
        ?

        Comment


        • #5
          Originally posted by Øyvind Snilsberg View Post
          I am a bit confused. If this was your data,
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str2 id1 str6 name1 str3 cod1 str2 id2 str6 name2 str3(cod2 id3) str7 name3 str3 cod3
          "X1" "Hotel1" "879" "" "" "" "" "" ""
          "" "" "" "X2" "Hotel2" "456" "" "" ""
          "" "" "" "" "" "" "X31" "Hotel31" "447"
          "" "" "" "" "" "" "X32" "Hotel32" "775"
          "" "" "" "" "" "" "X33" "Hotel33" "656"
          "" "" "" "" "" "" "" "" ""
          "" "" "" "" "" "" "" "" ""
          end
          what should the result look like?

          Perhaps,
          Code:
          ds
          foreach v in `r(varlist)' {
          replace `v' = `v'[_n-1] if !mi(`v'[_n-1]) & mi(`v'[_n]) & _n<[6]
          }
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str2 id1 str6 name1 str3 cod1 str2 id2 str6 name2 str3(cod2 id3) str7 name3 str3 cod3
          "X1" "Hotel1" "879" "" "" "" "" "" ""
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "" "" ""
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X31" "Hotel31" "447"
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X32" "Hotel32" "775"
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X33" "Hotel33" "656"
          "" "" "" "" "" "" "" "" ""
          "" "" "" "" "" "" "" "" ""
          end
          ?

          That output is correct! But in the case of having the following initial data:


          input str2 id1 str6 name1 str3 cod1 str2 id2 str6 name2 str3(cod2 id3) str7 name3 str3 cod3
          "X1" "Hotel1" "879" "" "" "" "" "" ""
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "" "" ""
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X31" "Hotel31" "447"
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X32" "Hotel32" "775"
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X33" "Hotel33" "656"
          "X4" "Hotel4" "765"
          X X X "X5" "Hotel5" "534"
          ​​​​​​​X X X X X X "X6" "Hotel6" "621"
          "" "" "" "" "" "" "" "" ""

          The result needed is:

          input str2 id1 str6 name1 str3 cod1 str2 id2 str6 name2 str3(cod2 id3) str7 name3 str3 cod3
          "X1" "Hotel1" "879" "" "" "" "" "" ""
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "" "" ""
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X31" "Hotel31" "447"
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X32" "Hotel32" "775"
          "X1" "Hotel1" "879" "X2" "Hotel2" "456" "X33" "Hotel33" "656"
          "X4" "Hotel4" "765"
          "X4" "Hotel4" "765 "X5" "Hotel5" "534"
          ​​​​​​​"X4" "Hotel4" "765 "X5" "Hotel5" "534" "X6" "Hotel6" "621"
          "" "" "" "" "" "" "" "" ""

          Instead of all values completed with [_n-1]

          Comment

          Working...
          X