
What exactly is a token, in relation to parsing?

I have to use a parser and writer in C++. I am trying to implement the functions, but I do not understand what a token is. One of my functions has to check whether there are more tokens to produce:

bool Parser::hasMoreTokens()

How exactly do I go about this? Please help.

SO!

I am opening a text file with text in it; all words are lowercase. How do I go about checking whether it has more tokens?

This is what I have:

bool Parser::hasMoreTokens() {
    while (source.peek() != NULL) {
        return true;
    }
    return false;
}
Technupe asked Apr 12 '11 17:04


1 Answer

Tokens are the output of lexical analysis and the input to parsing. Typically they are things like

  • numbers
  • variable names
  • parentheses
  • arithmetic operators
  • statement terminators

That is, roughly, the biggest things that can be unambiguously identified by code that just looks at its input one character at a time.
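To make that concrete, here is a minimal sketch of a lexer that produces the kinds of tokens listed above while looking at its input one character at a time. All the names in it (TokenKind, Token, Lexer, next, tokenize) are made up for illustration and are not the asker's Parser class; the token rules are deliberately simplified (integers only, one-character operators). It also shows one plausible way to write a hasMoreTokens check: skip whitespace, then compare peek() against EOF rather than NULL.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token kinds, mirroring the list above.
enum class TokenKind { Number, Identifier, Paren, Operator, Terminator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Sketch of a lexer that inspects its input one character at a time.
class Lexer {
public:
    explicit Lexer(std::istream& in) : source(in) {}

    // True if another token can be produced: skip whitespace, then check
    // for end of input. peek() signals end of input with EOF, not NULL.
    bool hasMoreTokens() {
        while (std::isspace(source.peek())) source.get();
        return source.peek() != std::char_traits<char>::eof();
    }

    // Consumes and returns the next token. Precondition: hasMoreTokens().
    Token next() {
        const bool more = hasMoreTokens();  // also skips leading whitespace
        assert(more);
        (void)more;  // avoids an unused-variable warning when NDEBUG is set
        int c = source.get();
        std::string text(1, static_cast<char>(c));
        if (std::isdigit(c)) {
            // Assumption: numbers are plain runs of digits.
            while (std::isdigit(source.peek())) text += static_cast<char>(source.get());
            return {TokenKind::Number, text};
        }
        if (std::isalpha(c) || c == '_') {
            while (std::isalnum(source.peek()) || source.peek() == '_')
                text += static_cast<char>(source.get());
            return {TokenKind::Identifier, text};
        }
        if (c == '(' || c == ')') return {TokenKind::Paren, text};
        if (c == ';') return {TokenKind::Terminator, text};
        // Simplification: any other single character counts as an operator.
        return {TokenKind::Operator, text};
    }

private:
    std::istream& source;
};

// Hypothetical convenience helper: tokenize a whole string.
std::vector<Token> tokenize(const std::string& input) {
    std::istringstream in(input);
    Lexer lexer(in);
    std::vector<Token> tokens;
    while (lexer.hasMoreTokens()) tokens.push_back(lexer.next());
    return tokens;
}
```

For a file of lowercase words, the same loop (while hasMoreTokens(), call next()) would hand back each word as an Identifier token; passing a std::ifstream instead of a std::istringstream should work the same way, since both are std::istream.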

One note, which you should feel free to ignore if it confuses you: The boundary between lexical analysis and parsing is a little fuzzy. For instance:

  1. Some programming languages have complex-number literals that look, say, like 2+3i or 3.2e8-17e6i. If you were parsing such a language, you could make the lexer gobble up a whole complex number and make it into a token; or you could have a simpler lexer and a more complicated parser, and make (say) 3.2e8, -, 17e6i be separate tokens; it would then be the parser's job (or even the code generator's) to notice that what it's got is really a single literal.

  2. In some programming languages, the lexer may not be able to tell whether a given token is a variable name or a type name. (This happens in C, for instance.) But the grammar of the language may distinguish between the two, so that you'd like "variable foo" and "type name foo" to be different tokens. (This also happens in C.) In this case, it may be necessary for some information to be fed back from the parser to the lexer so that it can produce the right sort of token in each case.
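The feedback described in point 2 is often arranged through a table of known type names that the parser fills in and the lexer consults (for C this is sometimes called the "lexer hack"). The sketch below is only an illustration under that assumption; TypeNameTable, declareType and classifyIdentifier are hypothetical names, and scoping is ignored entirely.

```cpp
#include <cassert>
#include <set>
#include <string>

// The two token kinds an identifier could turn into.
enum class IdentKind { VariableName, TypeName };

// Hypothetical table shared between the parser and the lexer.
class TypeNameTable {
public:
    // Parser side: called after parsing something like "typedef int foo;".
    void declareType(const std::string& name) {
        assert(!name.empty());
        names.insert(name);
    }

    bool isType(const std::string& name) const { return names.count(name) != 0; }

private:
    std::set<std::string> names;
};

// Lexer side: decide which kind of token an identifier should become,
// based on what the parser has declared so far.
IdentKind classifyIdentifier(const std::string& text, const TypeNameTable& table) {
    return table.isType(text) ? IdentKind::TypeName : IdentKind::VariableName;
}
```

The same spelling, foo, comes out as a different token before and after the parser has seen its declaration, which is exactly why the lexer cannot decide on its own.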

So "what exactly is a token?" may not always have a perfectly well defined answer.

Gareth McCaughan answered Sep 28 '22 09:09