I tried to define an int token in flex with the following regex:
int (?<!\w)(([1-9]\d*)|0)(?!\w)
I wanted the second line below to be rejected:
int a = 123;
int b = 123f; // the '123' should not be matched as an int
However, I got this:
bad character: <
bad character: !
bad character: \
...
What's more, it seems that the ? in the regex was ignored, which confused me.
Does flex not support lookaround assertions such as (?<=xxx) or (?<!xxx)?
I am new to flex and would really appreciate some help.
That's correct. Flex does not support negative lookahead assertions. It also does not support \w or \d, although it does allow POSIX-style character classes ([[:alpha:]], [[:digit:]], [[:alnum:]], etc.).
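As a rough sketch of what the integer part of the original pattern looks like in flex syntax, the scanner below replaces \d with [[:digit:]] and drops the unsupported assertions. The definition names DIGIT and INT and the printed messages are only illustrative. On its own, this still matches the 123 in 123f as an int; it only shows the syntax translation.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

DIGIT   [[:digit:]]
INT     0|[1-9]{DIGIT}*

%%
{INT}         { printf("int: %s\n", yytext); }
[[:space:]]+  { /* skip whitespace */ }
.             { printf("bad character: %s\n", yytext); }
%%

int main(void) { return yylex(); }
```

Assuming the file is saved as scanner.l (a hypothetical name), it can be built with `flex scanner.l && cc lex.yy.c -o scanner` and fed text on standard input.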
Flex regular expressions are quite different from JavaScript-like or Perl/Python-like "regular" expressions. For one thing, flex's regular expressions are really regular.
A complete list of the syntaxes flex allows is in the flex manual. Anything not described in that section of the manual is not implemented by flex.
There is very little point using "lookbehind" with flex, because flex always matches the longest token at the current input point. It does not search the input for a pattern.
Flex does implement a limited form of positive lookahead, using the / operator (which is not part of any regular expression library I know of). You could use that to match a sequence of digits only when it is not immediately followed by a letter:
[[:digit:]]+/[^[:alpha:]]
But you'll then need some pattern which does match the sequence of digits followed by an alphabetic character, because flex does not search for a matching token.
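One way to supply that extra pattern is sketched below. Because flex picks the longest match, a rule for "digits followed by letters" wins over the plain integer rule on input like 123f, so this sketch does not need the / operator at all. The token categories and messages are assumptions made for illustration, not part of the original question.

```lex
%option noyywrap
%{
#include <stdio.h>
%}

%%
0|[1-9][[:digit:]]*                    { printf("int: %s\n", yytext); }
[[:digit:]]+[[:alpha:]_][[:alnum:]_]*  { printf("invalid number: %s\n", yytext); }
[[:alpha:]_][[:alnum:]_]*              { printf("identifier: %s\n", yytext); }
[[:space:]]+                           { /* skip whitespace */ }
.                                      { printf("punctuation: %s\n", yytext); }
%%

int main(void) { return yylex(); }
```

With this scanner, the 123 in `int a = 123;` is reported as an int. In `int b = 123f;`, the whole of 123f is consumed by the second rule, which matches four characters against the integer rule's three. Keywords such as int are reported as identifiers here, since keyword handling is left out of the sketch.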