
Flex (the fast lexical analyzer) does not seem to support regex lookahead assertions

I tried to use the following regex in flex to define an int type:

int    (?<!\w)(([1-9]\d*)|0)(?!\w)

I wanted the 123 in the first line below to match as an int, but not the 123 in the second line:

int a = 123;
int b = 123f; // the '123' here should not be matched as an int

However, I got this:

bad character: <
bad character: !
bad character: \
...

What's more, it seems that the ? in the regex was ignored, which confused me. Does flex not support lookahead (?!xxx) or lookbehind (?<!xxx) assertions?

I am new to flex and would really appreciate some help.

stanleyerror asked Mar 11 '14 12:03



1 Answer

That's correct. Flex does not support negative lookahead assertions. It also does not support \w or \d, although it does allow POSIX-style character classes ([[:alpha:]], [[:digit:]], [[:alnum:]], etc.).

Flex regular expressions are quite different from JavaScript-like or Perl/Python-like "regular" expressions. For one thing, flex's regular expressions are really regular.

A complete list of the syntaxes flex allows is in the flex manual. Anything not described in that section of the manual is not implemented by flex.

There is very little point using "lookbehind" with flex, because flex always matches the longest token at the current input point. It does not search the input for a pattern.

Flex does implement a limited form of positive lookahead, using the / operator (which is not part of any regular expression library I know of). You could use that to match a sequence of digits only when it is not immediately followed by a letter:

[[:digit:]]+/[^[:alpha:]]

But you'll then need some pattern which does match the sequence of digits followed by an alphabetic character, because flex does not search for a matching token.
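
To see why that extra pattern does the job, here is a hand-written C sketch of the longest-match rule. It is not flex-generated code, and it swaps the / operator for plain longest-match between three rules that I made up for the illustration (shown in the comments). The token names, the helper functions and the choice to treat _ as a word character are all assumptions, not anything from flex itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this illustration. */
enum tok { TOK_INT, TOK_BAD_NUMBER, TOK_IDENT, TOK_OTHER, TOK_EOF };

struct token {
    enum tok kind;
    const char *start;  /* first character of the token */
    size_t len;         /* number of characters matched */
};

static int is_word_start(char c) { return isalpha((unsigned char)c) || c == '_'; }
static int is_word_char(char c)  { return isalnum((unsigned char)c) || c == '_'; }

/* Rule 1: longest prefix of s matching [[:digit:]]+ (0 if no match). */
static size_t match_int(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Rule 2: longest prefix matching [[:digit:]]+[[:alpha:]_][[:alnum:]_]*
 * (0 if no match). This is the "digits followed by a letter" pattern. */
static size_t match_bad_number(const char *s) {
    size_t n = match_int(s);
    if (n == 0 || !is_word_start(s[n])) return 0;
    while (is_word_char(s[n])) n++;
    return n;
}

/* Rule 3: longest prefix matching [[:alpha:]_][[:alnum:]_]* (0 if no match). */
static size_t match_ident(const char *s) {
    size_t n = 0;
    if (!is_word_start(s[0])) return 0;
    while (is_word_char(s[n])) n++;
    return n;
}

/* Return the next token starting at s, skipping leading whitespace.
 * Every rule is tried at the current position only; the longest match wins,
 * and on a tie the rule listed first wins. Nothing is searched for. */
static struct token next_token(const char *s) {
    assert(s != NULL);
    while (isspace((unsigned char)*s)) s++;

    struct token t = { TOK_EOF, s, 0 };
    if (*s == '\0') return t;

    const size_t lens[] = { match_int(s), match_bad_number(s), match_ident(s) };
    const enum tok kinds[] = { TOK_INT, TOK_BAD_NUMBER, TOK_IDENT };

    size_t best = 0;
    t.kind = TOK_OTHER;  /* fallback: one unmatched character such as '=' or ';' */
    for (size_t i = 0; i < 3; i++) {
        if (lens[i] > best) {
            best = lens[i];
            t.kind = kinds[i];
        }
    }
    t.len = best ? best : 1;
    return t;
}
```

Calling next_token repeatedly on "int b = 123f;" (continuing from start + len each time) should classify 123f as a single TOK_BAD_NUMBER, because rule 2 matches four characters where rule 1 matches only three. Likewise, a123 comes out as one TOK_IDENT, so the scanner is never positioned at the 123 inside it. That is why a lookbehind is not needed.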

rici answered Oct 17 '22 13:10
