What do tokens do, and why do they need to be created in C++ programming?

Tags: c++, c++11, token

I am reading a book (Programming: Principles and Practice Using C++ by Bjarne Stroustrup), in which he introduces tokens:

“A token is a sequence of characters that represents something we consider a unit, such as a number or an operator. That’s the way a C++ compiler deals with its source. Actually, “tokenizing” in some form or another is the way most analysis of text starts.”

class Token {
public:
    char kind;
    double value;
};

I get what they are, but he never explains why they are needed in detail, and it's quite confusing to me.

asked Nov 15 '17 13:11 by elizabeth


1 Answer

Tokenizing is an important step in figuring out what a program does. What Bjarne is referring to, in relation to C++ source, is how a program's meaning is affected by the tokenization rules. In particular, we must know what the tokens are and how they are determined: how can we identify a single token when it appears next to other characters, and how should we delimit tokens if there is ambiguity?

For instance, consider the prefix operators ++ and +. Suppose we had only the single token + to work with. What would the following snippet mean?

int i = 1;
++i;

With + only, is the above going to apply unary + to i twice? Or is it going to increment it once? It's ambiguous, naturally. We need an additional token, and therefore introduce ++ as its own "word" in the language.

But now there is another (albeit smaller) problem. What if the programmer wants to apply unary + twice, and not increment? Token processing rules are needed. So if we determine that whitespace is always a separator for tokens, our programmer may write:

int i = 1;
+ +i;

Roughly speaking, a C++ implementation starts with a file full of characters, transforms them initially to a sequence of tokens ("words" with meaning in the C++ language), and then checks if the tokens appear in a "sentence" that has some valid meaning.

answered Nov 15 '22 07:11 by StoryTeller - Unslander Monica