Recently I answered another question asking for questions every decent C++ programmer should be able to answer. My suggestion was
Q: How does a pointer point to an object?
A: The pointer stores the address of that object.
but user R.. disagrees with the A I propose to the Q. He says: "The correct answer would be 'it's implementation-specific'. While present-day implementations store numeric addresses as pointers, there's no reason it couldn't be something much more elaborate."
I certainly can't deny, just for the sake of disagreeing, that implementations other than storing an address could exist. I'm really interested in which other implementations are actually in use.
What other implementations of pointers, besides storing an address in an integer-type variable, are actually used in C++? How is casting (especially dynamic_cast) implemented?
On a conceptual level, I agree with you -- I define the address of an object as "the information needed to locate the object in memory". What the address looks like, though, can vary quite a bit.
A pointer value these days is usually represented as a simple, linear address... but there have been architectures where the address format isn't so simple, or varies depending on type. For example, programming in real mode on an x86 (e.g. under DOS), you sometimes have to store the address as a segment:offset pair.
See http://c-faq.com/null/machexamp.html for some more examples. I found the reference to the Symbolics Lisp machine intriguing.
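To make the segment:offset case concrete, here is a minimal sketch of how such a pair maps to a linear address in x86 real mode (segment times 16, plus offset). The names FarPointer, linear_address and same_location are made up for this illustration and do not come from any compiler or library.

```cpp
#include <cstdint>

// A real-mode x86 "far pointer": the address is a pair, not one integer.
// (Hypothetical type, for illustration only.)
struct FarPointer {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Real-mode translation: the 16-bit segment is shifted left by four bits
// (multiplied by 16) and the 16-bit offset is added to it.
constexpr std::uint32_t linear_address(FarPointer p) {
    return (static_cast<std::uint32_t>(p.segment) << 4) + p.offset;
}

// Different segment:offset pairs can name the same byte, so comparing the
// raw pairs is not the same as comparing the locations they denote.
constexpr bool same_location(FarPointer a, FarPointer b) {
    return linear_address(a) == linear_address(b);
}
```

For example, B800:0000 resolves to linear address 0xB8000, and 1234:0010 and 1235:0000 both resolve to 0x12350, which is one reason pointer comparison on such a platform is less trivial than comparing two integers.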
I would call Boost.Interprocess as a witness.
In Boost.Interprocess the interprocess pointers are offsets from the beginning of the mapped memory area. This makes it possible to receive a pointer from another process, map the memory area (whose base address may differ from the one in the process that passed the pointer), and still get to the same object.
Therefore, interprocess pointers are not represented as addresses, but they can be resolved to one.
Thanks for watching :-)
If we are familiar with accessing array elements using pointer arithmetic, it is easy to understand how objects are laid out in memory and how dynamic_cast works. Consider the following simple class:
struct point
{
    point (int x, int y) : x_ (x), y_ (y) { }
    int x_;
    int y_;
};
point* p = new point(10, 20);
Assume that p is assigned the memory location 0x01. As a simplified mental model (real compilers usually store the members inline, at fixed offsets from the start of the object), imagine that its member variables are stored in their own separate locations, say x_ at 0x04 and y_ at 0x07. It is then easier to visualize the object as an array of pointers; p (in our case 0x01) points to the beginning of the array:
0x01
+-------+-------+
|       |       |
+---+---+---+---+
    |       |
    |       |
   0x04    0x07
 +-----+ +-----+
 | 10  | | 20  |
 +-----+ +-----+
So code to access the fields will essentially become accessing array elements using pointer arithmetic:
p->x_; // => **p
p->y_; // => *(*(p + 1))
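The array-of-pointers picture is a conceptual aid. On the mainstream compilers I know of, the members sit inline inside the object, and a member access is the object's address plus a fixed byte offset. A minimal sketch of that view, where read_int_at is a hypothetical helper written for this example:

```cpp
#include <cstddef>  // offsetof, std::size_t

struct point {
    point(int x, int y) : x_(x), y_(y) {}
    int x_;
    int y_;
};

// Read an int member given the object's address and the member's byte
// offset. This mimics what the compiler emits for p->x_ or p->y_.
inline int read_int_at(const point* p, std::size_t byte_offset) {
    const char* base = reinterpret_cast<const char*>(p);
    return *reinterpret_cast<const int*>(base + byte_offset);
}
```

With point p(10, 20), read_int_at(&p, offsetof(point, x_)) yields 10 and read_int_at(&p, offsetof(point, y_)) yields 20. Because point is a standard-layout type, x_ is at offset 0, so the pointer to the object and the pointer to its first member hold the same address.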
If the language supports some kind of automatic memory management, like GC, additional fields may be added to the object array behind the scenes. Imagine a C++ implementation that collects garbage with the help of reference counting. Then the compiler might add an additional field (rc) to keep track of that count. The above array representation then becomes:
0x01
+-------+-------+-------+
|       |       |       |
+---+---+---+---+---+---+
    |       |       |
    |       |       |
   0x02    0x04    0x07
 +-----+ +-----+ +-----+
 | rc  | | 10  | | 20  |
 +-----+ +-----+ +-----+
The first cell points to the address of the reference count. The compiler will emit appropriate code to access the portions of p that should be visible to the outside world:
p->x_; // => *(*(p + 1))
p->y_; // => *(*(p + 2))
Now it is easy to understand how dynamic_cast works. The compiler deals with polymorphic classes by adding an extra hidden pointer to the underlying representation. This pointer contains the address of the beginning of another 'array' called the vtable, which in turn contains the addresses of the implementations of the virtual functions of the class. But the first entry of the vtable is special: it does not point to a function but to an object of a class called type_info. (The exact slot varies between ABIs; what matters is that the type_info is reachable through the vtable.) This object contains the run-time type information of the object and pointers to the type_infos of its base classes. Consider the following example:
class Frame
{
public:
    virtual void render (Screen* s) = 0;
    // ....
};
class Window : public Frame
{
public:
    virtual void render (Screen* s)
    {
        // ...
    }
    // ....
private:
    int x_;
    int y_;
    int w_;
    int h_;
};
An object of Window will have the following memory layout:
window object (w)
+---------+
| &vtable +--------+
+---------+        |
| &x_     |        v  vtable               Window type_info     Frame type_info
+---------+      +-------------------+     +--------------+     +----------------+
| &y_     |      | &type_info        +---->|              +---->|                |
+---------+      +-------------------+     +--------------+     +----------------+
| &w_     |      | &Window::render() |
+---------+      +-------------------+
| &h_     |
+---------+
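The run-time type information attached to a polymorphic object can be observed directly with typeid. The sketch below repeats trimmed-down versions of Frame and Window; the empty Screen struct and the helper is_exactly_window are assumptions added so that the example compiles on its own.

```cpp
#include <typeinfo>

struct Screen {};  // placeholder: the original does not show Screen

class Frame {
public:
    virtual ~Frame() {}
    virtual void render(Screen* s) = 0;
};

class Window : public Frame {
public:
    virtual void render(Screen*) {}
};

// Applied to a reference to a polymorphic class, typeid reports the type
// of the object actually referred to, not the static type Frame.
inline bool is_exactly_window(const Frame& f) {
    return typeid(f) == typeid(Window);
}
```

Passing a Window through a Frame& still makes is_exactly_window return true, because the lookup goes through the object's hidden pointer rather than through the declared type of the reference.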
Now consider what will happen when we try to cast a Window* to a Frame*:
Frame* f = dynamic_cast<Frame*> (w);
dynamic_cast will follow the type_info links from the vtable of w, confirm that Frame is in its list of base classes, and assign w to f. If it cannot find Frame in the list, f is set to 0, indicating that the cast failed. (For an upcast like this one a compiler can resolve the conversion at compile time; the run-time search is what happens for downcasts and cross-casts.) The vtable provides an economical way to reach the type_info of a class. This is one reason why the run-time check in dynamic_cast works only for classes with virtual functions. Restricting dynamic_cast to polymorphic types also makes sense from a logical point of view. That is, if an object has no virtual functions, it cannot safely be manipulated without knowledge of its exact type.
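A small sketch of the run-time check in action, for the downcast direction where the lookup actually matters. Screen is a placeholder and Dialog is a hypothetical sibling class, both added only so the example is self-contained and a cast can fail; as_window and refers_to_window are helper names invented here.

```cpp
#include <typeinfo>  // std::bad_cast

struct Screen {};  // placeholder: the original does not show Screen

class Frame {
public:
    virtual ~Frame() {}
    virtual void render(Screen* s) = 0;
};

class Window : public Frame {
public:
    virtual void render(Screen*) {}
};

// Hypothetical sibling class, present only so that a cast can fail.
class Dialog : public Frame {
public:
    virtual void render(Screen*) {}
};

// Pointer form: yields a null pointer when *f is not a Window.
inline Window* as_window(Frame* f) {
    return dynamic_cast<Window*>(f);
}

// Reference form: there is no null reference, so failure throws std::bad_cast.
inline bool refers_to_window(Frame& f) {
    try {
        (void)dynamic_cast<Window&>(f);
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

Given a Window w and a Dialog d, as_window(&w) returns &w while as_window(&d) returns a null pointer, and refers_to_window reports true and false respectively.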
The target type of dynamic_cast need not be polymorphic. This allows us to wrap a concrete type in a polymorphic type:
// no virtual functions
class A
{
};
class B
{
public:
    virtual void f() = 0;
};
class C : public A, public B
{
    virtual void f() { }
};
C* c = new C;
A* a = dynamic_cast<A*>(c); // OK
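The rule is easier to see with a cross-cast, where the run-time lookup is really needed: starting from a B* (polymorphic source), dynamic_cast can reach the non-polymorphic A part of the same C object. This sketch extends the classes above with a tag member, a virtual destructor, and a hypothetical OnlyB class so that both outcomes can be observed.

```cpp
// No virtual functions: A is not polymorphic.
class A {
public:
    int tag = 7;  // added for the example, so the result can be inspected
};

class B {
public:
    virtual ~B() {}
    virtual void f() = 0;
};

class C : public A, public B {
public:
    virtual void f() {}
};

// Hypothetical class that derives from B but not from A.
class OnlyB : public B {
public:
    virtual void f() {}
};

// Cross-cast: the source type B is polymorphic, the target type A is not.
// The check succeeds only if the complete object has a public A base.
inline A* cross_cast_to_A(B* b) {
    return dynamic_cast<A*>(b);
}

// The reverse, dynamic_cast<B*>(a) with a of type A*, would not compile,
// because the source type A has no virtual functions.
```

For a C object viewed through a B*, cross_cast_to_A returns a pointer to its A subobject; for an OnlyB object it returns a null pointer.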