
What should the output of a lexer be in C?

Tags:

c

lex

#include<stdio.h>

int main()
{
  int a,b;
  a=a+b;
  printf("%d",a);
  return 0;
}

What should the output be if this code is passed through a lexer?

Hick asked Apr 18 '10 12:04



1 Answer

The lexer just tokenizes the input: it turns a stream of characters into a stream of tokens, which a parser will consume later to build a full syntax tree. For your example you would obtain something like:

#include <stdio.h> (this line is a preprocessor directive; it is expanded before the compiler's lexer tokenizes the program, so it yields no tokens of its own, and the tokens coming from the contents of stdio.h are omitted here)

int KEYWORD
main IDENT
( LPAR
) RPAR
{ LBRACE
int KEYWORD
a IDENT
, COMMA
b IDENT
; SEMICOL
a IDENT
= ASSIGN
a IDENT
+ PLUS
b IDENT
; SEMICOL
printf IDENT
( LPAR
"%d" STRING
, COMMA
a IDENT
) RPAR
; SEMICOL
return KEYWORD
0 INTEGER
; SEMICOL
} RBRACE

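Of course a lexer by itself can't do much: it only splits the source into the smallest meaningful elements and reports lexical errors, such as an illegal character or an unterminated string literal. It will not catch a misspelled keyword, because that is still a valid identifier as far as the lexer is concerned. You will need a parser to combine the tokens into a structure, and later stages to give that structure a semantic meaning.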

Just a side note: some lexers group similar kinds of tokens into a single one (for example, one KEYWORD token for all keywords, with an associated parameter saying which keyword it is), while others have a different token for every keyword, like RETURN_KEYWORD, IF_KEYWORD and so on.
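The two styles can be sketched side by side in C. Everything here is hypothetical and only meant to illustrate the trade-off: the type names `GroupedToken`, `GroupedKind`, `Keyword` and `SeparateKind`, the functions `classify_grouped` and `classify_separate`, and the three-keyword table are all assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Keyword spellings; the order must match both enums below. */
static const char *const keyword_text[] = { "int", "return", "if" };
enum { KEYWORD_COUNT = sizeof keyword_text / sizeof keyword_text[0] };

/* Style 1: a single KEYWORD kind plus an attribute saying which keyword. */
typedef enum { KW_INT, KW_RETURN, KW_IF } Keyword;
typedef enum { G_KEYWORD, G_IDENT } GroupedKind;
typedef struct {
    GroupedKind kind;
    Keyword keyword; /* meaningful only when kind == G_KEYWORD */
} GroupedToken;

/* Style 2: every keyword is its own token kind. */
typedef enum { S_INT_KEYWORD, S_RETURN_KEYWORD, S_IF_KEYWORD, S_IDENT } SeparateKind;

GroupedToken classify_grouped(const char *word) {
    assert(word != NULL);
    for (int i = 0; i < KEYWORD_COUNT; i++)
        if (strcmp(word, keyword_text[i]) == 0)
            return (GroupedToken){ G_KEYWORD, (Keyword)i };
    return (GroupedToken){ G_IDENT, KW_INT }; /* keyword field unused here */
}

SeparateKind classify_separate(const char *word) {
    assert(word != NULL);
    for (int i = 0; i < KEYWORD_COUNT; i++)
        if (strcmp(word, keyword_text[i]) == 0)
            return (SeparateKind)i; /* relies on the matching enum order */
    return S_IDENT;
}
```

With the separate style a parser can dispatch on the token kind alone; with the grouped style it first sees KEYWORD and then has to look at the attached attribute, which keeps the list of kinds short at the cost of one extra check.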

Jack answered Sep 19 '22 00:09