Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Map of Pointers versus Map of Structures/Containers (C++)

I am taking a data structures class, and in all of the examples from the professor, he always makes his map have a value of a pointer to a structure or container as opposed to holding the structure or container itself.

Is he just doing it as a habit or is there a good reason for it such as speed increases?

  • I am aware you can use pointers to data to avoid having redundant copies of data yet still house directions to that data in multiple containers/structures at the same time.
  • In these examples, that is not the case. The data is only in that map.
like image 991
user904963 Avatar asked Sep 17 '12 00:09

user904963


2 Answers

As I see it, there are a number of factors involved in deciding whether to use pointers vs. objects:

1. Do you or don't you need polymorphism?

If you want to maintain a container of base class objects, but then store objects of various derived classes in it, you must use pointers, because virtual function calls would otherwise not be resolved correctly.

2. The size of the objects you store and their suitability for copy operations

One of the key reasons why pointers may be preferrable to objects is that various operations performed on the container involve making copies of the objects stored in it. This is the case for many storing operations (e.g. std::vector<>::push_back() or std::map<>::insert()), some retrieval operations (e.g. std::vector<>::operator[], and then storing the object in a local variable), and some of the operations carried out by the container "internally", e.g. re-allocation of a vector when it grows beyond its capacity, or rehashing of std::unordered_map<>. Note that the copy operations may be less significant depending on how you choose the container and how make use of it (e.g. using std::vector<>::reserve() to allocate sufficient space, using std::vector<>::emplace_back() for storage, and never making a local copy of a retrieved element may mean that no copy is ever made).

However, if you expect a large number of copies to be made (or if profiling existing code reveals that many copies are made), using pointers instead of objects can obviously help as pointers are small and well-aligned in memory. Then again, this will not make much sense if the objects you store are actually smaller than pointers.

3. Other operations you perform on the container and its contents

Even if the objects you are dealing with are larger than pointers and you expect a significant amount of copy operations, using pointers is not necessarily preferrable. Consider a situation where you store a large number of medium-size objects (say, 16 bytes each) and you frequently need to iterate over the entire container and perform some sort of statistical calculation. When you store these objects directly in a vector, you get great cache-efficiency during iteration: As you retrieve one object, an entire cache-line will be retrieved from memory, hence making the retrieval of the next few objects much faster. This is generally not the case when pointers are used; on the contrary, after retrieving an element, the pointer must be dereferenced, causing another move operation from a memory region that is possibly not cached.

So clearly, it all depends on the type and size of objects you store, and the type and frequency of the operations you carry out. If the objects you are dealing with are various types of windows, buttons and menus of a GUI application, you will most likely want to use pointers and take advantage of polymorphism. If, on the other hand, you are dealing with huge structures of compact elements, all identical in size and shape, and the operations you perform involve frequent iteration or bulk copying, storing objects directly is perferrable. There may also be situations where the decision is hard to make without trying both and deciding based on the results of memory and time benchmarks.


As a final note, if you end up using pointers, consider whether the container you are building is the ultimate owner of the objects are you allocating on heap, or just maintains temporary pointers. If the container is the owner of those objects, you will be well-advised to use smart pointers rather than raw ones.

like image 189
jogojapan Avatar answered Sep 18 '22 17:09

jogojapan


The benefit of storing the object instances directly in the container is that you avoid a level of indirection & you save on the space that the pointer itself uses. You can win in terms of both time & space efficiency by storing object instances directly instead of storing pointers. If you understand a bit about how processor cache memory works, it's not hard to see how storing object instances "inline" in the container can have a real performance benefit.

Without making any assumptions about the contained type or the container usage pattern, the default container should be std::vector<T> (and not std::vector<T*>). Beginning with that default choice, you'll use something other than a vector if you can see how the usage pattern will benefit from the performance profile of that other type of structure. Likewise, you'll have the container store pointers to objects if pointer indirection is either required or seems to be worth it in performance terms. Indirection is required if the contained type isn't copy-constructable, and it's also required if the container doesn't "own" its objects.

like image 22
Adam McKee Avatar answered Sep 16 '22 17:09

Adam McKee