I have been reading and experimenting with the standard library's smart pointers, unique_ptr
and shared_ptr
and although they are obviously great replacements for a lot of cases where raw pointers might be deemed dangerous I am unsure of their use when implementing data structures.
To experiment, I have written an example of a hash map which uses shared_ptr
- which according to the Meyer's Effective Modern C++ are roughly double the size of a unique_ptr
. For this reason I would like to use unique_ptr but I am kind of stumped due to what I am performing in the Add function, updating and copying.
Does anyone have any suggestions for this problem? Should data structures remain to be written using raw pointers?
#pragma once
#include "core.h"
const int TABLE_SIZE = 256;
template<typename K>
class HashKey {
public:
unsigned long operator()(const K& p_key) const {
return (p_key) % TABLE_SIZE;
}
};
template<typename K, typename T>
class HashNode {
public:
K m_key;
T m_value;
std::shared_ptr<HashNode> next = nullptr;
};
template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
std::array< std::shared_ptr< HashNode<K, T> >, 128 > m_table;
F m_hash_function;
int m_elem_count{ 0 };
void Add(K p_key, T p_value);
};
template<typename K, typename T, typename F = HashKey<K>>
void HashMap<K, T, F>::Add(K p_key, T p_value)
{
unsigned long key = m_hash_function(p_key);
std::shared_ptr<HashNode<K, T>> new_node = std::make_shared<HashNode<K, T>>();
new_node->m_key = p_key;
new_node->m_value = p_value;
if (m_table[key] == nullptr) {
/* only item in the bucket */
m_table[key] = std::move(new_node);
m_elem_count++;
}
else {
/* check if item exists so it is replaced */
std::shared_ptr< HashNode<K, T> > current = m_table[key];
std::shared_ptr< HashNode<K, T> > previous = m_table[key];
while (current != nullptr && p_key != current->m_key ) {
previous = current;
current = current->next;
}
if (current == nullptr) {
previous->next = new_node;
//current = new_node;
m_elem_count++;
}
else {
current->m_value = p_value;
}
}
}
void TestHashMap() {
HashMap<int, std::string> hash_map;
hash_map.Add(1, "one");
hash_map.Add(2, "does");
hash_map.Add(3, "not");
hash_map.Add(50, "simply");
hash_map.Add(11, "waltz");
hash_map.Add(11, "into");
hash_map.Add(191, "mordor");
std::cout << hash_map.m_elem_count << std::endl;
}
Smart pointers are class objects that behave like raw pointers but manage objects that are new and when or whether to delete them— smart pointers automatically delete the managed object at the appropriate time.
Use when you want to assign one raw pointer to multiple owners, for example, when you return a copy of a pointer from a container but want to keep the original. The raw pointer is not deleted until all shared_ptr owners have gone out of scope or have otherwise given up ownership.
In C++, a smart pointer is implemented as a template class that mimics, by means of operator overloading, the behaviors of a traditional (raw) pointer, (e.g. dereferencing, assignment) while providing additional memory management features.
Pointers are used to store and manage the addresses of dynamically allocated blocks of memory. Such blocks are used to store data objects or arrays of objects. Most structured and object-oriented languages provide an area of memory, called the heap or free store, from which objects are dynamically allocated.
The choice of smart pointer depends on how your data structure "owns" the heap-allocated objects.
If you need to simply observe, and not own an object (independently of whether it is heap-allocated or not), a raw pointer, a reference or an std::reference_wrapper
is the appropriate choice.
If you need unique ownership (at most one owner of the heap-allocated object), then use std::unique_ptr
. It has no additional time/memory overhead.
If you need shared ownership (any number of owners of the heap-allocated object), then use std::shared_ptr
. It results in additional time/memory overhead because an additional pointer (to the reference count metadata) has to be stored, and because accessing it is guaranteed to be thread-safe.
There's no need to use std::unique_ptr
(in place of a raw pointer) unless you actually need to own the object.
Assuming you need to own the object, there's no need to use std::shared_ptr
(in place of an std::unique_ptr
) unless you actually need shared ownership semantics.
In your case, it looks like you have a maximum of heap nodes in your HashMap
. Therefore, I'm assuming that you want the HashMap
instance to be the unique owner of the nodes.
What type should you use?
template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
std::array</* ? */, 128 > m_table;
// ...
};
You have two options:
If you want to store the objects with an heap indirection, use std::unique_ptr
, as the unique owner of these heap-allocated object is and will always be the HashMap
instance.
If you want to store the objects directly into the HashMap
, with no heap indirection, then do not use any pointer at all. This could lead to very big HashMap
instances. Interface for accessing the next
nodes becomes cumbersome.
Option 1 (store nodes in the heap):
This is the most common, and probably best option.
template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
std::array<std::unique_ptr<HashNode<K, T>>, 128 > m_table;
// ...
};
This will result in lighter (in terms of memory footprint) HashMap
instances.
Note: using an std::vector
in place of an std::array
will reduce the size of HashMap
significantly, but will introduce an additional heap indirection. This is the common way of implementing a similar data structure. You generally want the HashMap
instance to be as lightweight as possible, so that it can be copied/moved/stored efficiently.
There's no need to use smart pointers to connect the nodes between each other, as the nodes are owned exclusively by HashMap
. A raw pointer will be sufficient.
template<typename K, typename T>
class HashNode {
public:
// ...
HashNode* next_ptr = nullptr;
auto& next()
{
assert(next_ptr != nullptr);
return *next_ptr;
}
};
The code above will work fine, assuming that the HashMap
is still alive when accessing next
.
Option 2 (store nodes in the map instance):
template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
std::array<HashNode<K, T>, 128 > m_table;
// ...
};
HashMap
instances may be enormous, depending on the size of HashNode<K, T>
.
If you choose to store the nodes directly into the HashMap
with no heap indirection, you will have to use an index to access the internal array, as moving/copying the HashMap
around will change the memory address of the nodes.
template<typename K, typename T>
class HashNode {
public:
// ...
int next_index = -1;
auto& next(HashMap& map)
{
assert(next_index != -1);
return map.m_table[next_index];
}
};
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With