
What functions does a lexer need to provide?

I am writing a lexer. Please don't tell me not to, because most of it is already done.
Currently it builds an array of tokens and that's it.

I would like to know which functions the lexer needs to provide, with a brief explanation of what each function needs to do.

I'll accept the most complete list.

An example function would be:

next: Consume the current token and return it

Also, should the lexer have the expect function or should the interpreter implement it?

By the way, the lexer constructor accepts a string as its argument, performs the lexical analysis, and stores all the tokens in the "tokens" variable.

The language is JavaScript, so I can't overload operators.


1 Answer

In my experience, you need:

  • nextToken: move forward in the input and return the next token, which becomes the current token.
  • curToken: return the current token without moving.
  • curValue: return the value of the current token. Tokens like STRING and NUMBER have values; tokens like SEMICOLON don't.
  • sourcePos: return the source position (line number, character position) of the first character of the current token.

edit — oh also:

  • prefetch: initialize the lexer by getting the first token, so that curToken has something to return.
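
The five functions listed above could be sketched roughly like this for a lexer that, as in the question, tokenizes the whole string in its constructor and keeps the result in "tokens". This is only a minimal sketch under assumptions: the token kinds in RULES, the shape of a token object ({ type, value, line, col }) and the EOF token are made up for illustration and are not part of the answer.

```javascript
// Hypothetical token kinds; a real language would need many more rules.
// The "y" (sticky) flag makes each regex match only at lastIndex.
const RULES = [
  ["NUMBER", /\d+(?:\.\d+)?/y],
  ["STRING", /"(?:[^"\\\n]|\\.)*"/y],
  ["IDENT", /[A-Za-z_]\w*/y],
  ["SEMICOLON", /;/y],
];

class Lexer {
  constructor(source) {
    this.tokens = Lexer.tokenize(source); // eager, as described in the question
    this.index = -1;                      // no current token yet
    this.prefetch();
  }

  // Turn the source string into an array of tokens ending with EOF.
  static tokenize(source) {
    const tokens = [];
    let pos = 0, line = 1, col = 1;
    while (pos < source.length) {
      const ch = source[pos];
      if (ch === "\n") { pos++; line++; col = 1; continue; }
      if (/\s/.test(ch)) { pos++; col++; continue; }
      let matched = false;
      for (const [type, re] of RULES) {
        re.lastIndex = pos;
        const m = re.exec(source);
        if (!m) continue;
        const text = m[0];
        let value; // stays undefined for tokens such as SEMICOLON
        if (type === "NUMBER") value = Number(text);
        else if (type === "STRING") value = text.slice(1, -1); // escapes left raw
        else if (type === "IDENT") value = text;
        tokens.push({ type, value, line, col });
        pos += text.length;
        col += text.length;
        matched = true;
        break;
      }
      if (!matched) throw new SyntaxError(`Unexpected '${ch}' at ${line}:${col}`);
    }
    tokens.push({ type: "EOF", value: undefined, line, col });
    return tokens;
  }

  // prefetch: make the first token the current one.
  prefetch() {
    this.index = 0;
    return this.tokens[0];
  }

  // nextToken: advance (never past EOF) and return the new current token.
  nextToken() {
    if (this.index < this.tokens.length - 1) this.index++;
    return this.tokens[this.index];
  }

  // curToken: return the current token without moving.
  curToken() {
    return this.tokens[this.index];
  }

  // curValue: value of the current token, undefined if it has none.
  curValue() {
    return this.curToken().value;
  }

  // sourcePos: where the current token starts in the source.
  sourcePos() {
    const { line, col } = this.curToken();
    return { line, col };
  }
}

// Usage: walk the token stream until EOF.
const demo = new Lexer("total 42;");
while (demo.curToken().type !== "EOF") {
  const { line, col } = demo.sourcePos();
  console.log(demo.curToken().type, demo.curValue(), `${line}:${col}`);
  demo.nextToken();
}
```

Calling prefetch from the constructor is a design choice made here so that curToken is valid right away; a lexer that reads its input lazily might leave that call to the parser.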

Additionally, for some languages you might want two or more tokens of lookahead. Then you'd want a variation on plain curToken so that you can look at a bigger "window" on the token stream. For most languages, however, that's not really necessary.
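
One possible shape for that lookahead variation is a peek function that takes an offset into the token array. The sketch below is an assumption-laden illustration: the names TokenWindow, peek and startsAssignment, the EQUALS token kind, and the convention that the token list ends with an EOF token are all invented here, not taken from the answer.

```javascript
class TokenWindow {
  // tokens: array of { type, value } objects whose last element is EOF (assumed).
  constructor(tokens) {
    if (tokens.length === 0) throw new Error("token list must end with EOF");
    this.tokens = tokens;
    this.index = 0;
  }

  // peek(0) is the current token, peek(1) the one after it, and so on.
  // Looking past the end keeps returning the final (EOF) token.
  peek(offset = 0) {
    if (offset < 0) throw new RangeError("offset must be >= 0");
    const i = Math.min(this.index + offset, this.tokens.length - 1);
    return this.tokens[i];
  }

  curToken() {
    return this.peek(0);
  }

  nextToken() {
    if (this.index < this.tokens.length - 1) this.index++;
    return this.curToken();
  }
}

// Example decision that needs two tokens: does a statement start
// with "IDENT EQUALS", that is, is it an assignment?
function startsAssignment(win) {
  return win.peek(0).type === "IDENT" && win.peek(1).type === "EQUALS";
}
```

With a helper like this, the parser can decide between an assignment and a plain expression before consuming anything.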

edit again: also, I won't tell you not to write one, because they're basically the funnest things ever. In JavaScript you can't get too crazy, but in a language like Erlang you can have your lexer act like a "token pump" by making it generate a stream of tokens that it sends to a separate parser process.

Pointy answered Jan 30 '26 20:01
