Dear Statalist members,
After half an hour of scrolling through forum search results without success, I have concluded that my problem has apparently not been discussed here often, so I am turning to you.
I'm having trouble with seemingly hidden or invisible elements within strings. As part of a larger project for automatically checking filter conditions in survey datasets, I extract more or less human-readable filter conditions from a database via ODBC and process them further in Stata. Along the way, these conditions are sometimes split into several individual conditions, and expressions such as from ... to are recoded to inrange(), or var = 1 & 2 & 3 & 4 to inlist(var,1,2,3,4), etc.
One of the final steps is to test which error codes Stata returns when these filter conditions are executed, for example as count if `filter'.
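To make that step concrete, here is a minimal sketch of how such a check could look. It assumes the conditions sit in a string variable named filter, one per observation, in the same dataset as the variables they refer to; rc_check is a hypothetical name for the variable that stores the return code.

```stata
* Sketch only: run every stored filter and keep Stata's return code.
* Assumes a string variable -filter-; -rc_check- is a hypothetical name.
generate int rc_check = .
forvalues i = 1/`=_N' {
    * copy the condition of observation i into a local macro
    local filter = filter[`i']
    * -capture- suppresses the error and leaves the code in _rc
    capture count if `filter'
    local rc = _rc
    replace rc_check = `rc' in `i'
}
list filter rc_check if rc_check != 0
```

A return code of 0 means the condition ran; anything else (such as 111 or 133) is the error code Stata would have issued.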
There are some filters that appear to be Stata-compliant (Stata functions are correct, all parentheses and quotation marks are complete, etc.), but still return error code 133.
After exporting a data sample via dataex for this post, I saw what is probably the crucial clue to the actual problem. In the unmodified dataex output below, dataex also seems unable to handle my problem correctly: a line break of some kind is output within observation 2. These hidden line breaks are apparently what causes Stata error code 133 when they are part of a filter condition.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float id str697 filter float errorcode
1 "inlist(td00352,1)|inlist(td00353,2)" 111
2 "inlist(td00353,1)
" 133
end
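One possible way to check whether a string such as the one in observation 2 contains hidden characters is sketched below. It is only a rough sketch under assumptions: it looks for tab, line feed and carriage return (character codes 9, 10 and 13), which may not be the only culprits, and has_ctrl and filter_clean are hypothetical variable names.

```stata
* Sketch only: flag filters that contain a tab, line feed or carriage return.
* -has_ctrl- and -filter_clean- are hypothetical variable names.
generate byte has_ctrl = strpos(filter, char(9)) > 0 | ///
    strpos(filter, char(10)) > 0 | strpos(filter, char(13)) > 0
list id filter if has_ctrl

* Replace each of these characters with a blank, then trim the ends.
generate filter_clean = filter
foreach c in 9 10 13 {
    replace filter_clean = subinstr(filter_clean, char(`c'), " ", .)
}
replace filter_clean = strtrim(filter_clean)
```

Replacing with a blank rather than deleting keeps neighbouring tokens apart if a line break happens to sit in the middle of a condition.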
In short: is there a simple and quick way to identify and delete these hidden elements within strings?
As the example above reveals, I am unfortunately unable to generate a comprehensible and executable data example for you, so I hope that someone will still take pity on me and help me out.
Kind regards,
Benno Schönberger