How is LLVM isa implemented?

Question

From http://llvm.org/docs/CodingStandards.html#ci_rtti_exceptions

LLVM does make extensive use of a hand-rolled form of RTTI that use templates like isa<>, cast<>, and dyn_cast<>. This form of RTTI is opt-in and can be added to any class. It is also substantially more efficient than dynamic_cast<>.

How is isa and the others implemented?

Matthieu M. · Accepted Answer

First of all, the LLVM system is extremely specific and not at all a drop-in replacement for the RTTI system.

Premises

For most classes, it is unnecessary to generate RTTI information
When it is required, the information only makes sense within a given hierarchy
We preclude multi-inheritance from this system

Identifying an object class

Take a simple hierarchy, for example:

struct Base {}; /* abstract */
struct DerivedLeft: Base {}; /* abstract */
struct DerivedRight:Base {};
struct MostDerivedL1: DerivedLeft {};
struct MostDerivedL2: DerivedLeft {};
struct MostDerivedR: DerivedRight {};

We will create an enum specific to this hierarchy, with an enum member for each of the hierarchy member that can be instantiated (the others would be useless).

enum BaseId {
  DerivedRightId,
  MostDerivedL1Id,
  MostDerivedL2Id,
  MostDerivedRId
};

Then, the Base class will be augmented with a method that will return this enum.

struct Base {
  static inline bool classof(Base const*) { return true; }

  Base(BaseId id): Id(id) {}

  BaseId getValueID() const { return Id; }

  BaseId Id;
};

And each concrete class is augmented too, in this manner:

struct DerivedRight: Base {
  static inline bool classof(DerivedRight const*) { return true; }

  static inline bool classof(Base const* B) {
    switch(B->getValueID()) {
    case DerivedRightId: case MostDerivedRId: return true;
    default: return false;
    }
  }

  DerivedRight(BaseId id = DerivedRightId): Base(id) {}
};

Now, it is possible, simply, to query the exact type, for casting.

Hiding implementation details

Having the users murking with getValueID would be troublesome though, so in LLVM this is hidden with the use of classof methods.

A given class should implement two classof methods: one for its deepest base (with a test of the suitable values of BaseId) and one for itself (pure optimization). For example:

struct MostDerivedL1: DerivedLeft {
  static inline bool classof(MostDerivedL1 const*) { return true; }

  static inline bool classof(Base const* B) {
    return B->getValueID() == MostDerivedL1Id;
  }

  MostDerivedL1(): DerivedLeft(MostDerivedL1Id) {}
};

This way, we can check whether a cast is possible or not through the templates:

template <typename To, typename From>
bool isa(From const& f) {
  return To::classof(&f);
}

Imagine for a moment that To is MostDerivedL1:

if From is MostDerivedL1, then we invoke the first overload of classof, and it works
if From is anything other, then we invoke the second overload of classof, and the check uses the enum to determine if the concrete type match.

Hope it's clearer.

Anton Korobeynikov · Answer

Just adding stuff to osgx's answer: basically each class should implement classof() method which does all the necessary stuff. For example, the Value's classof() routine looks like this:

  // Methods for support type inquiry through isa, cast, and dyn_cast:
  static inline bool classof(const Value *) {
    return true; // Values are always values.
  }

To check whether we have a class of the appropriate type, each class has it's unique ValueID. You can check the full list of ValueID's inside the include/llvm/Value.h file. This ValueID is used as follows (excerpt from Function.h):

  /// Methods for support type inquiry through isa, cast, and dyn_cast:
  static inline bool classof(const Function *) { return true; }
  static inline bool classof(const Value *V) {
    return V->getValueID() == Value::FunctionVal;
  }

So, in short: every class should implement classof() method which performs the necessary decision. The implementation in question consists of the set of unique ValueIDs. Thus in order to implement classof() one should just compare the ValueID of the argument with own ValueID.

If I remember correctly, the first implementation of isa<> and friends were adopted from boost ~10 years ago. Right now the implementations diverge significantly :)

How is LLVM isa<> implemented?

Tags:

c++

llvm

2 Answers

Matthieu M.

Anton Korobeynikov

Recent Activity

Donate For Us