

  • missing values row and "column"

    Hi folks,

    Is there a way to ...
    1. count non-missing values with egen num_nonmiss = rownonmiss(), strok while also ignoring certain characters, with something like ignore(, - : .)?
    2. show not only the last value for each observation but also the variable it is stored in (again with some kind of ignore(, - : .))?
    (egen lastvalue = rowlast() definitely "falls over for a mix of numeric and string variables"; see Andrzej's thread.)

    Thanks a lot
    Last edited by Franz Gerbig; 20 Jan 2017, 08:48.
    Thank you for reading (and some reply)
    Using Stata 16.1
    Extractions (-dataex-) of the data I'm working with is impossible, sorry!

  • #2
    And what is it that you want ignored? A concrete data example (real or realistic) would help.


    • #3
      The characters specified: ignore(, - : .). Or would it have to be ignore("," ":" ".")?


      • #4
        Your punctuation was misinterpreted as an emoticon (smiley) before you edited it.

        The short answer is: write your own loop; and that is expanded on in a review of working row-wise.

        I assume that you are working with string variables only. This gives a template:

        gen count = 0
        gen last = ""
        gen whichlast = ""
        foreach v of var stringvarlist {
             replace last = `v' if !inlist(`v',"",",",";",":")
             replace whichlast = "`v'" if !inlist(`v',"",",",";",":")
             replace count = count + !inlist(`v',"",",",";",":")
        }
        Last edited by Nick Cox; 20 Jan 2017, 09:12.


        • #5
          Hi again,


          foreach var of varlist stringlist {
               replace `var' = "" if inlist(`var', " ", ".", ",", ";", ":", "-")
          }

          I eliminated those actually invalid string values.

          So now, I'd like to identify the last valid value and variable - whether numeric or string (I need to deal with both).
          Do you have an idea how to get there?



          • #6
            Your criterion of interest is presumably now != ""; otherwise it's the same idea.
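
            As a minimal sketch of that idea, assuming the invalid strings have already been replaced by "" and all variables of interest are strings: the toy variables a, b and c and the variable check are hypothetical names used only for illustration. The final assert cross-checks the loop's count against egen's rownonmiss() function with the strok option.

            ```stata
            // Hypothetical toy data: three string variables, "" marks a missing value
            clear
            input str10 a str10 b str10 c
            "x" ""  "z"
            "x" "y" ""
            ""  ""  ""
            end

            gen count = 0
            gen last = ""
            gen whichlast = ""
            foreach v of varlist a b c {
                // later variables overwrite earlier ones, so the last non-empty one wins
                replace last = `v' if `v' != ""
                replace whichlast = "`v'" if `v' != ""
                replace count = count + (`v' != "")
            }

            // Cross-check the count against egen's row function for strings
            egen check = rownonmiss(a b c), strok
            assert count == check
            list
            ```

            Because the loop runs through the variables in the order given, the order of the varlist decides what "last" means.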


            • #7
              From time to time I go back to the issue and try to solve it ...

              //some completely senseless example data
              input id edu str20 name age str6 sex str6 city
              1    2    "A. Doyle"        22    "male"        "Durban"
              2    4    "Mary Hope"        37    "female"    "Urban"
              3    7    "Guy Fawkes"    48    "male"        ""
              4    3    ","                69    "male"        ""
              5    5    "Mc Fly"        53    "-"            ""
              6    8    "Mil Grim"        70    ""            ""
              end
              //starting the actual job I'm dealing with
              gen count = 0
              gen lastval = ""
              gen lastvar = ""
              ds id count lastval lastvar, not //store only real question variables into the macro
              global questions "`r(varlist)'"
              foreach v of varlist $questions {
                   replace lastval = "`v'" if !inlist(`v',"",",",";",":",".") & !missing(`v') //identify last given valid value for an observation
                   replace lastvar = "`v'" if !inlist(`v',"",",",";",":",".") & !missing(`v') //identify last variable containing that value for that observation
                   replace count = count + !inlist(`v',"",",",";",":",".") & !missing(`v')
              }
              I get a type mismatch error, I guess because the combined if condition is applied to both string and numeric variables.
              Should I do it separately for all string variables and all numeric variables? But then how do I know which one is the real lastvar? (String variable a may be the last string variable with a valid value, but numeric variable b might show up later, and vice versa.)

              Also, I wonder whether I have to nest the loop above inside another foreach x { ... loop?

              Thank you very much for some light on all this!
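
              One possible way around the type mismatch, sketched here under assumptions rather than taken from the thread: test each variable's type inside the loop with confirm string variable and apply a type-appropriate validity check, so string and numeric variables are still processed in dataset order. The temporary variable ok is a hypothetical helper, and the list of invalid strings (including "-") is an assumption about what counts as invalid.

              ```stata
              clear
              input id edu str20 name age str6 sex str6 city
              1 2 "A. Doyle"   22 "male"   "Durban"
              2 4 "Mary Hope"  37 "female" "Urban"
              3 7 "Guy Fawkes" 48 "male"   ""
              4 3 ","          69 "male"   ""
              5 5 "Mc Fly"     53 "-"      ""
              6 8 "Mil Grim"   70 ""       ""
              end

              gen count = 0
              gen lastval = ""
              gen lastvar = ""
              ds id count lastval lastvar, not   // keep only the question variables
              local questions `r(varlist)'

              tempvar ok
              gen byte `ok' = 0
              foreach v of local questions {
                  capture confirm string variable `v'
                  if _rc == 0 {
                      // string variable: valid unless empty or a placeholder character
                      replace `ok' = !inlist(`v', "", "-", ",", ";", ":", ".")
                      replace lastval = `v' if `ok'
                  }
                  else {
                      // numeric variable: valid unless missing; store its value as text
                      replace `ok' = !missing(`v')
                      replace lastval = string(`v') if `ok'
                  }
                  replace lastvar = "`v'" if `ok'
                  replace count = count + `ok'
              }
              list id count lastval lastvar
              ```

              With the example data, id 6 should end up with lastvar equal to age and lastval equal to "70", since sex and city are empty for that observation. This variant leaves the original variables untouched, at the cost of a branch per variable.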


              • #8
                Well, a few simple changes:
                adding tostring `v', replace as the first statement of the loop,
                dropping the quotes in lastval = "`v'" (so that it reads lastval = `v'),
                dropping the & !missing(`v'),
                and a tiny correction to the arguments of !inlist() (one character was missing: "-")

                do the job pretty well:
                foreach v of varlist $questions {
                     tostring `v', replace
                     replace lastval = `v' if !inlist(`v',"","-",",",";",":",".")
                     replace lastvar = "`v'" if !inlist(`v',"","-",",",";",":",".")
                     replace count = count + !inlist(`v',"","-",",",";",":",".")
                }
                It had to be something simple.
                Thanks a lot!
                Last edited by Franz Gerbig; 24 Feb 2017, 08:35.

