I want to use a regex like (^|\s)1001(\s|$) in a Firebird SIMILAR TO expression.
Examples:
abc 1001 abc - true
abc 121001 abc - false
1001 abc - true
121001 - false
abc 1001 - true
I tried to convert it to a regex in Firebird:
Where COLUMN similar to (^|[:WHITESPACE:])abc 1001 abc($|[:WHITESPACE:])
but ^ (start of line) and $ (end of line) do not work, and the query fails with an "Invalid SIMILAR TO pattern" exception.
I cannot find anything about start and end of line in the Firebird docs at https://firebirdsql.org/refdocs/langrefupd25-similar-to.html
From the Firebird 2.5 Language Reference, SIMILAR TO documentation:
SIMILAR TO matches a string against an SQL regular expression pattern. Unlike in some other languages, the pattern must match the entire string in order to succeed—matching a substring is not enough.
In other words, the pattern is always matched against the whole string, including any line breaks in it, and, going by the linked documentation, there are no start/end anchors. They are already implied (for the whole string, not per line), because partial matches are not supported.
The regular expression implementation in Firebird conforms to the SQL standard, which also doesn't define start / end anchors.
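To make the difference between substring matching and whole-string matching concrete, here is a small sketch in Python. It is only an illustration of the semantics: Python's re module is not Firebird's engine, and the helper names matches_substring and matches_whole are made up for this example.

```python
import re

# The regex from the question, in a dialect that supports anchors.
PATTERN = r"(^|\s)1001(\s|$)"


def matches_substring(value: str) -> bool:
    """Substring semantics: the pattern may match anywhere in the value."""
    return re.search(PATTERN, value) is not None


def matches_whole(value: str) -> bool:
    """Whole-string semantics, similar in spirit to SIMILAR TO."""
    return re.fullmatch(PATTERN, value) is not None


if __name__ == "__main__":
    for value in ["abc 1001 abc", "abc 121001 abc", "1001"]:
        print(repr(value), matches_substring(value), matches_whole(value))
```

With whole-string semantics the anchors add nothing: the text before and after 1001 has to be consumed by the pattern itself, which is what % does in SIMILAR TO.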
Given your requirements, you probably need something like:
'(abc 1001( %)?)|((% )?1001 abc)'
Where ( %)? means optionally match a space followed by zero or more of any character. Given that the whole string must match, this means it finds either a space or the end of the string, and similarly for (% )?.
You may need to add additional terms if you also need to find this in the middle of a string (but none of your examples suggested that).
Or, a direct equivalent of (^|\s)1001(\s|$):
'(%[[:WHITESPACE:]])?1001([[:WHITESPACE:]]%)?'
An earlier version of this answer used (% |) instead of (% )?, but since empty alternatives are neither documented nor part of the standard, the fact that this works is possibly an implementation bug or at best an undocumented feature. Use it at your own risk.
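The direct-equivalent pattern above can be checked against the examples from the question with a rough emulation in Python. This is a sketch under assumptions, not Firebird's implementation: the helper similar_to is hypothetical, it only translates the small subset of SIMILAR TO syntax used here (%, _ and [[:WHITESPACE:]]), and it assumes [[:WHITESPACE:]] covers space, tab, line feed, vertical tab, form feed and carriage return.

```python
import re

# Assumed character set for Firebird's [[:WHITESPACE:]] class.
WS = r"[ \t\n\v\f\r]"


def similar_to(value: str, pattern: str) -> bool:
    """Emulate SIMILAR TO for a tiny subset of its pattern syntax.

    Only %, _ and [[:WHITESPACE:]] are translated; groups, ? and |
    happen to mean the same thing in Python's re syntax.
    """
    regex = (
        pattern
        .replace("[[:WHITESPACE:]]", WS)
        .replace("%", ".*")
        .replace("_", ".")
    )
    # fullmatch mirrors the rule that the pattern must match the entire string.
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


PATTERN = "(%[[:WHITESPACE:]])?1001([[:WHITESPACE:]]%)?"

if __name__ == "__main__":
    examples = ["abc 1001 abc", "abc 121001 abc", "1001 abc", "121001", "abc 1001"]
    for value in examples:
        print(repr(value), similar_to(value, PATTERN))
```

Under this emulation the five examples give true, false, true, false, true, which matches the expected results in the question. The real query should still be tested in Firebird itself.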
Now, (^|\s)1001(\s|$) would not work as is, since it relies on partial matching, which is not possible with SIMILAR TO:
SIMILAR TO matches a string against an SQL regular expression pattern. Unlike in some other languages, the pattern must match the entire string in order to succeed—matching a substring is not enough.
Then, (^|\s) means either start of string or whitespace. That means 1001 can either appear at the start of the string, or follow any chars and then a whitespace. ($|\s) means either end of string or whitespace. That means you need to account for 3 cases:
any chars, whitespace, 1001, whitespace, any chars
1001, whitespace, any chars
any chars, whitespace, 1001
You need to use
WHERE col SIMILAR TO '%[[:WHITESPACE:]]1001[[:WHITESPACE:]]%' or col SIMILAR TO '1001[[:WHITESPACE:]]%' or col SIMILAR TO '%[[:WHITESPACE:]]1001'
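As a quick sanity check of the three OR-ed patterns, here is a sketch in Python that hand-translates each one into an equivalent whole-string regex. The names WS, CASES and matches_any_case are made up for this example, the whitespace set is an assumption about what [[:WHITESPACE:]] covers, and Python's re is only standing in for Firebird here.

```python
import re

# Assumed character set for Firebird's [[:WHITESPACE:]] class.
WS = r"[ \t\n\v\f\r]"

# Hand-translated equivalents of the three SIMILAR TO patterns (% becomes .*).
CASES = [
    rf".*{WS}1001{WS}.*",  # '%[[:WHITESPACE:]]1001[[:WHITESPACE:]]%'
    rf"1001{WS}.*",        # '1001[[:WHITESPACE:]]%'
    rf".*{WS}1001",        # '%[[:WHITESPACE:]]1001'
]


def matches_any_case(value: str) -> bool:
    """True if the whole value matches at least one of the three patterns."""
    return any(re.fullmatch(p, value, flags=re.DOTALL) for p in CASES)


if __name__ == "__main__":
    examples = ["abc 1001 abc", "abc 121001 abc", "1001 abc", "121001", "abc 1001", "1001"]
    for value in examples:
        print(repr(value), matches_any_case(value))
```

The five examples from the question come out as expected. Note that a value that is exactly 1001 matches none of the three patterns, because each of them requires at least one whitespace character. If that case matters, an extra condition such as OR col = '1001' would presumably be needed.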