How to create a regex without certain group of letters in lex

I've recently started learning lex, so as practice I decided to write a program that recognises the declaration of a simple variable (sort of).

This is my code:

%{
#include "stdio.h"
%}
dataType "int"|"float"|"char"|"String"
alphaNumeric [_\*a-zA-Z][0-9]*
space [ ]
variable {dataType}{space}{alphaNumeric}+
%option noyywrap
%%
{variable} printf("ok");
. printf("incorrect");
%%
int main(){
yylex();
}

Some inputs for which the output should be ok:

int var3
int _varR3
int _AA3_

But if I type int float as input, it also returns ok, which is wrong because both are reserved words.

So my question is: what should I modify so that my expression rejects the 'dataType' words after the space?

Thank you.

maspinu asked Oct 18 '22 21:10

1 Answer

A preliminary consideration: this kind of construct is typically checked in the parsing phase, not in the lexing phase. With yacc/bison, for instance, you would have a rule that matches only a "type" token followed by an "identifier" token, so "int float" (two "type" tokens in a row) would simply fail to parse.

To achieve that with lex/flex alone, though, you could experiment with the negation (^) and trailing context (/) operators. Or...

If you're running flex, perhaps simply surrounding all your regexes with parentheses and passing the -l flag would do the trick. Note that there are a few differences between lex and flex, as described in the Flex manual.
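Another way to keep this inside the scanner, different from the two suggestions above, is to rely on rule ordering plus a start condition. The sketch below is only an illustration under a few assumptions: the state name AFTER_TYPE and the definition names type and ident are made up for this example, and ident is a conventional identifier pattern rather than the alphaNumeric pattern from the question. When two rules match text of the same length, lex picks the rule listed first, so placing {type} before {ident} makes "float" count as a type, while a longer word such as "floaty" still counts as an identifier.

```lex
%{
#include <stdio.h>
%}
%option noyywrap
%x AFTER_TYPE
type    "int"|"float"|"char"|"String"
ident   [_a-zA-Z][_a-zA-Z0-9]*
%%
{type}              { BEGIN(AFTER_TYPE); /* saw a type, now expect a name */ }
{ident}             { printf("incorrect\n"); /* a name with no type before it */ }
[ \t\n]+            { /* ignore whitespace between declarations */ }
.                   { printf("incorrect\n"); }
<AFTER_TYPE>[ ]+    { /* skip the spaces between the type and the name */ }
<AFTER_TYPE>{type}  { printf("incorrect\n"); BEGIN(INITIAL); /* reserved word used as a name */ }
<AFTER_TYPE>{ident} { printf("ok\n"); BEGIN(INITIAL); }
<AFTER_TYPE>(.|\n)  { printf("incorrect\n"); BEGIN(INITIAL); }
%%
int main(void) {
    yylex();
    return 0;
}
```

With this layout, an input such as int var3 should reach the <AFTER_TYPE>{ident} rule and print ok, while int float should hit <AFTER_TYPE>{type} first and print incorrect. Once the language grows beyond single declarations, moving the check into a yacc/bison grammar is probably still the cleaner option.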

Leandro T. C. Melo answered Nov 15 '22 06:11