
  • How to extract variable name from assert error message if variable not found

    Dear Statalist members,

    when executing a conditional assert in Stata, if a variable in the condition is not present in the data set, the assert fails with a "... not found" message naming that variable, followed by error code 111. For a larger project, I need to process the name of the missing variable programmatically, but I can't find a returned macro that holds it.

    Code:
    . sysuse auto
    (1978 automobile data)
    
    . assert price < 5000 if missingvariable ==1
    missingvariable not found
    r(111);
    
    . return list
    
    macros:
                     r(fn) : "C:\Program Files\Stata19\ado\base/a/auto.dta"
    
    . ereturn list
    
    . sreturn list
    Is it possible that Stata doesn't store this variable name anywhere?

    Kind regards and thank you in advance,
    Benno Schönberger

  • #2
    Error messages are typically not stored, so you might have to log the output and parse the resulting log-file.


    • #3
      This can be complicated by how complex the conditions are, but you can work around it by using -confirm- (specifically a flavour of -confirm variable-). This lets you confirm the existence of a variable, but the onus is on you to parse out the name of the variable to pass to -confirm-.
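
      A minimal sketch of that check, assuming the candidate names have already been parsed out of the condition (the list in the local macro candidates is hypothetical). The exact option stops -confirm variable- from accepting abbreviations such as pri for price:

      Code:
      sysuse auto, clear
      
      * hypothetical list of names taken from a condition
      local candidates price missingvariable
      
      * collect every name that is not a variable in the data
      local notfound
      foreach v of local candidates {
          capture confirm variable `v', exact
          if _rc {
              local notfound `notfound' `v'
          }
      }
      
      display as text "not found: `notfound'"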


      • #4
        I don't have a better or different answer than those from Andrew Musau and Leonardo Guizzetti; my understanding matches theirs.

        But I have a question: what would you do with this information if it were available? In my experience, code fails with an error and I fix it, very quickly or very slowly as the case may be. Either way, I see the error message naming the supposed variable and move on from there.


        • #5
          Dear Nick Cox,

          good question, and the answer isn't particularly easy to give without going into further detail. Therefore, please excuse the length of my answer.

          I work at a large educational research institute and process many different survey datasets more or less semi-automatically. Each survey is generated in a metadata system that lets us reuse items over several years and define extremely complex filters in advance, determining which respondents should or should not receive certain questions. We ask people about their entire employment, education, and relationship history, among many other topics, and to keep the survey as short as possible, many questions are filtered out if certain initial criteria are not met. For example, a person who indicates that they are single is not asked any further questions about relationships. However, these filters are not fully machine-readable; they are intended as instructions that the survey institute programs into its survey software.

          So far, staff have reviewed this filter programming manually, and I am now trying to automate that review, fully or at least partially, with an ado. To do this, I read the filter conditions from the database, transform them into Stata-compliant code, and test the result in several places.

          One of the final tests is the actual application to the specific dataset provided by the survey institute. For each variable in the dataset for which an input filter was defined, the values of the variable are checked. In some cases, variables may be defined in the filter that don't actually exist in the current dataset because they are part of a preloaded dataset that I may not have received or that was incorrectly not provided. Since I process a large number of data sets consecutively, I'd like to display a brief message for each dataset indicating how many of the defined filters are correct, how many are incorrect, and how many couldn't be tested at all because defined variables are missing. I can't manually intervene every time a filter check fails due to a missing variable.

          If I could temporarily store the variable name from the `assert` error message, I could generate a summary of missing variables after processing one or all of the data sets.
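
          The per-dataset counts described above could be tallied from the return code of each assert alone. A rough sketch, assuming assert exits with r(9) when the assertion is false and r(111) when something in the expression is not found; the three filters are made-up examples, and r(111) can have causes other than a missing variable:

          Code:
          sysuse auto, clear
          
          * hypothetical filter conditions, one per local macro
          local filter1 "price > 0 if foreign == 1"
          local filter2 "price < 5000 if foreign == 1"
          local filter3 "price < 5000 if missingvariable == 1"
          
          local n_ok = 0
          local n_failed = 0
          local n_untested = 0
          
          forvalues i = 1/3 {
              * run each filter silently and classify it by return code
              capture assert `filter`i''
              local rc = _rc
              if `rc' == 0 {
                  local ++n_ok
              }
              else if `rc' == 9 {
                  local ++n_failed
              }
              else if `rc' == 111 {
                  local ++n_untested
              }
              else {
                  * anything else is unexpected, so stop with that error
                  error `rc'
              }
          }
          
          display as text "correct: `n_ok'  incorrect: `n_failed'  not testable: `n_untested'"

          This counts the untestable filters but does not say which variable is missing, which is why the name itself is still needed.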

          As it stands, I'll probably have to add an extra step to extract all the strings from the filter conditions that could be variable names and check for their presence. I was hoping to avoid this step.

          Best regards,
          Benno Schönberger


          • #6
            Originally posted by Andrew Musau
            Error messages are typically not stored, so you might have to log the output and parse the resulting log-file.
            Many thanks to Andrew; I hadn't considered this solution and would be interested to know how it could be implemented. Should the entire output be written to a single log file, or should a new temporary log file be created for each assertion, and then parsed for a specific variable name only when certain error codes occur? Unfortunately, I haven't done anything like this before and wouldn't know how to implement it. I would be open to any suggestions...

            Thank you too, Leonardo Guizzetti.
            I could at least implement your suggested solution with my current Stata knowledge, but I wanted to avoid it if possible. But now it seems I have no other choice.


            • #7
              Code:
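              sysuse auto , clear
              
              capture program drop assert2
              program define assert2, rclass
                  version 19
              
                  local cmdline `"`0'"'
              
                  * run the assertion, showing its output but trapping any error
                  capture noisily assert `cmdline'
                  local rc = _rc
              
                  * r(111): something in the expression was not found
                  if `rc' == 111 {
                      local rest `"`cmdline'"'
              
                      * split the command line at operators and punctuation
                      while `"`rest'"' != "" {
                          gettoken tok rest : rest, parse(" =+-*/^<>()[]{}.,:&|!~")
              
                          * consider only tokens that look like valid names
                          if regexm("`tok'", "^[A-Za-z_][A-Za-z0-9_]*$") {
                              * skip keywords and function names
                              if !inlist("`tok'", "if", "in", "missing", "mi", "abs", "log", "exp") {
                                  * report the first name that is not a variable
                                  capture confirm variable `tok'
                                  if _rc {
                                      return local missing "`tok'"
                                      di as error "missing variable: `tok'"
                                      exit 111
                                  }
                              }
                          }
                      }
              
                      exit 111
                  }
              
                  * pass any other error through unchanged
                  if `rc' {
                      exit `rc'
                  }
              end
              
              assert2 price < 5000 if missingvariable == 1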


              • #8
                Originally posted by Benno Schoenberger
                Many thanks to Andrew; I hadn't considered this solution and would be interested to know how it could be implemented
                Here's a quick draft:
                Code:
                program bennos_assert , rclass
                    
                    version 16.1
                    
                    capture noisily assert `macval(0)'
                    if (_rc != 111) ///
                        exit _rc
                    
                    tempfile tmpfile
                    tempname tmpname
                    
                    quietly {
                        
                        log using "`tmpfile'" , text name(`tmpname')
                        capture noisily assert `macval(0)'
                        log close `tmpname'
                    
                    }
                    
                    assert ustrregexm(fileread("`tmpfile'"),"^(.+) not found$")
                    
                    return local notfound = ustrregexs(1)
                    
                    exit 111
                    
                end
                The downside to this approach is that you cannot run it quietly because Stata logs only what's printed to the screen.
                Last edited by daniel klein; 26 Apr 2026, 02:07.


                • #9
                  Thanks a lot to George Ford,
                  I did have to add inlist and inrange to the list of excluded strings, but now it works perfectly. Thank you so much!
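
                  As far as the documentation goes, inlist() accepts between 2 and 10 arguments when they are strings, so an exclusion list that keeps growing may be easier to hold in a local macro and test with the in operator of macro lists. A sketch only, assuming the token sits in the local tok as in #7; the macro names excluded and skip are hypothetical:

                  Code:
                  * hypothetical exclusion list, extended as needed
                  local excluded "if in missing mi abs log exp inlist inrange"
                  local tok inrange
                  
                  * skip is 1 if tok is one of the excluded words, else 0
                  local skip : list tok in excluded
                  if `skip' {
                      display as text "`tok' is skipped"
                  }
                  else {
                      display as text "`tok' is checked with -confirm variable-"
                  }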
