I need to parse source code. I've identified three kinds of tokens: symbols (operators, keywords), literals (integers, strings, etc.) and identifiers.
I already have the following design, with a base class that records the type of the subclass so that a base class pointer can be downcast:
class Token
{
public:
    type_e type; // E_SYMBOL, E_LITTERAL, E_IDENTIFIER
};
class Symbol : public Token
{
public:
    const symbol_e symbol;
};
class Litteral : public Token
{
public:
    const Value value;
};
class Identifier : public Token
{
public:
    const std::string name;
};
I need these objects to be stored in a single array of tokens, which is why they need a common base class. I then use them like this:
if( cur->type == E_SYMBOL && static_cast< const Symbol * >( cur )->symbol == E_LPARENT )
{
// ...
}
I could create virtual functions isSymbol, isLitteral, isIdentifier that each subclass would override, but I would still have to downcast the base class pointer to a subclass pointer to access the subclass's specific data.
People say downcasting means the interface is likely flawed, and it makes the syntax very heavy, so I'd like to find another way, but I can't. Some people suggested the visitor pattern, but I'm afraid it would needlessly complicate the code, and I don't even understand how the visitor pattern would apply to this problem.
Can anyone help? Thank you :)
You have three options. Each solution has its advantages and disadvantages.
Put the logic into the token classes, so the calling code does not need to know which kind of token it is dealing with.
This would be the "purest object oriented" solution. The disadvantage is that the logic tends to spread between the base class and the subclasses, which makes it harder to follow. It may also cause the classes to grow rather large. But compilers and interpreters don't usually have so many actions that this becomes a problem.
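As a rough sketch of this first option, the check from the question can become a virtual function on Token, so the calling code never looks at a type tag or casts. The isSymbol and describe members, the countLeftParens helper, the enum values other than E_LPARENT, and the use of int for Value are all assumptions made up for illustration; they are not from the question.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the question's symbol_e and Value types.
enum symbol_e { E_LPARENT, E_RPARENT, E_PLUS };
using Value = int;

class Token
{
public:
    virtual ~Token() = default;
    // Answers what the caller actually wants to know; false unless overridden.
    virtual bool isSymbol(symbol_e) const { return false; }
    // An example action that every kind of token implements itself.
    virtual std::string describe() const = 0;
};

class Symbol : public Token
{
public:
    explicit Symbol(symbol_e s) : symbol(s) {}
    bool isSymbol(symbol_e s) const override { return symbol == s; }
    std::string describe() const override
    {
        return "symbol #" + std::to_string(static_cast<int>(symbol));
    }
private:
    const symbol_e symbol;
};

class Litteral : public Token
{
public:
    explicit Litteral(Value v) : value(v) {}
    std::string describe() const override { return "literal " + std::to_string(value); }
private:
    const Value value;
};

class Identifier : public Token
{
public:
    explicit Identifier(std::string n) : name(std::move(n)) {}
    std::string describe() const override { return "identifier " + name; }
private:
    const std::string name;
};

// Counts opening parentheses without inspecting a type tag or casting.
int countLeftParens(const std::vector<std::unique_ptr<Token>>& tokens)
{
    int count = 0;
    for (const auto& token : tokens)
        if (token->isSymbol(E_LPARENT))
            ++count;
    return count;
}
```

With this shape, the test from the question shrinks to cur->isSymbol(E_LPARENT). The price is the one mentioned above: every new action means another virtual function on Token and an override in each subclass that needs it.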
Use the Visitor Pattern.
That is, have an interface TokenVisitor with a visit method overloaded for each token subtype, and an accept(TokenVisitor&) method on Token which each subclass overrides to call the appropriate overload of visit.
The interface now has to know the complete set of token types, but this keeps the token classes reasonably small, and because the logic is grouped by action it is usually easier to follow.
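A minimal sketch of the visitor described above could look like the following. TokenVisitor, visit and accept are the names used in this answer; the Describer visitor, the describe helper, the enum values other than E_LPARENT, and int as a stand-in for Value are assumptions invented for the example.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-ins for the question's symbol_e and Value types.
enum symbol_e { E_LPARENT, E_RPARENT, E_PLUS };
using Value = int;

class Symbol;
class Litteral;
class Identifier;

// The interface lists every token type, one visit overload each.
class TokenVisitor
{
public:
    virtual ~TokenVisitor() = default;
    virtual void visit(const Symbol&) = 0;
    virtual void visit(const Litteral&) = 0;
    virtual void visit(const Identifier&) = 0;
};

class Token
{
public:
    virtual ~Token() = default;
    virtual void accept(TokenVisitor& visitor) const = 0;
};

class Symbol : public Token
{
public:
    explicit Symbol(symbol_e s) : symbol(s) {}
    // *this has static type Symbol here, so the Symbol overload is chosen.
    void accept(TokenVisitor& visitor) const override { visitor.visit(*this); }
    const symbol_e symbol;
};

class Litteral : public Token
{
public:
    explicit Litteral(Value v) : value(v) {}
    void accept(TokenVisitor& visitor) const override { visitor.visit(*this); }
    const Value value;
};

class Identifier : public Token
{
public:
    explicit Identifier(std::string n) : name(std::move(n)) {}
    void accept(TokenVisitor& visitor) const override { visitor.visit(*this); }
    const std::string name;
};

// One action is one visitor class; all of its logic sits in one place.
class Describer : public TokenVisitor
{
public:
    void visit(const Symbol& s) override
    {
        text = "symbol #" + std::to_string(static_cast<int>(s.symbol));
    }
    void visit(const Litteral& l) override { text = "literal " + std::to_string(l.value); }
    void visit(const Identifier& i) override { text = "identifier " + i.name; }
    std::string text;
};

// Convenience wrapper: runs the Describer on any token.
std::string describe(const Token& token)
{
    Describer describer;
    token.accept(describer);
    return describer.text;
}
```

No cast appears anywhere: accept dispatches on the dynamic type of the token, and overload resolution inside each accept picks the matching visit. Adding a new action means writing one more visitor class and leaving the token classes alone.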
Use a discriminated union, for example Boost.Variant.
This is not object oriented at all. It will lead to switches over the type all over the place and will probably look ugly. But since the logic is all together it is often easier to follow, especially for somebody who does not understand the idea behind the code.
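For illustration, here is a sketch of the discriminated union approach. It uses std::variant from C++17 rather than Boost.Variant, which plays the same role here. The isLeftParen and describe functions, the enum values other than E_LPARENT, and int as a stand-in for Value are assumptions made up for the example.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-ins for the question's symbol_e and Value types.
enum symbol_e { E_LPARENT, E_RPARENT, E_PLUS };
using Value = int;

// Plain structs: no base class, no virtual functions, no type tag to maintain.
struct Symbol { symbol_e symbol; };
struct Litteral { Value value; };
struct Identifier { std::string name; };

// The variant itself remembers which alternative it currently holds.
using Token = std::variant<Symbol, Litteral, Identifier>;

// The check from the question: get_if returns nullptr for other alternatives.
bool isLeftParen(const Token& token)
{
    const Symbol* symbol = std::get_if<Symbol>(&token);
    return symbol != nullptr && symbol->symbol == E_LPARENT;
}

// The "switch over the type", written with std::visit and if constexpr.
std::string describe(const Token& token)
{
    return std::visit([](const auto& tok) -> std::string {
        using T = std::decay_t<decltype(tok)>;
        if constexpr (std::is_same_v<T, Symbol>)
            return "symbol #" + std::to_string(static_cast<int>(tok.symbol));
        else if constexpr (std::is_same_v<T, Litteral>)
            return "literal " + std::to_string(tok.value);
        else
            return "identifier " + tok.name;
    }, token);
}
```

Tokens can then be stored by value, for example in a std::vector<Token>, with no pointers or heap allocation per token. As noted above, every action becomes a switch of this kind, which is the trade-off of this option.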