Order of precedence for token matching in Flex

Question

My apologies if the title of this thread is a little confusing. What I'm asking about is how does Flex (the lexical analyzer) handle issues of precedence?

For example, let's say I have two tokens with similar regular expressions, written in the following order:

"//"[!\/]{1}    return FIRST;
"//"[!\/]{1}\<  return SECOND;

Given the input "//!<", will FIRST or SECOND be returned? Or both?

The FIRST string would be reached before the SECOND string, but it seems that returning SECOND would be the right behavior.

Bart Kiers · Accepted Answer

The longest match is returned.

From flex & bison, Text Processing Tools:

How Flex Handles Ambiguous Patterns

Most flex programs are quite ambiguous, with multiple patterns that can match the same input. Flex resolves the ambiguity with two simple rules:

Match the longest possible string every time the scanner matches input.

In the case of a tie, use the pattern that appears first in the program.

You can test this yourself, of course:

file: demo.l

%%
"//"[!/]   {printf("FIRST");}
"//"[!/]<  {printf("SECOND");}
%%

int main(int argc, char **argv)
{
    while(yylex() != 0);
    return 0;
}

Note that / and < don't need escaping, and {1} is redundant.

bart@hades:~/Programming/GNU-Flex-Bison/demo$ flex demo.l 
bart@hades:~/Programming/GNU-Flex-Bison/demo$ cc lex.yy.c  -lfl
bart@hades:~/Programming/GNU-Flex-Bison/demo$ ./a.out < in.txt 
SECOND

where in.txt contains //!<.

Order of precedence for token matching in Flex

Tags:

tokenize

lexical-analysis

flex-lexer

Casey Patton

1 Answers

How Flex Handles Ambiguous Patterns

file: demo.l

Bart Kiers

Recent Activity

Donate For Us

Order of precedence for token matching in Flex

Tags:

tokenize

lexical-analysis

flex-lexer

Casey Patton

1 Answers

How Flex Handles Ambiguous Patterns

file: demo.l

Bart Kiers

Related questions

Recent Activity

Donate For Us