I just discovered that there are some tree-based data structures that, when high performance is the goal, are often stored as one contiguous chunk of memory; this is especially popular with the so-called "policy-based data structures".
The problem is that I can't wrap my head around why one would want to do that: when you try to "linearize" a tree to store it as a vector/array, how do you make sure that you are re-arranging the branches and the leaves in a meaningful way that actually helps performance? Is this only workable for perfectly balanced trees?
In other words, I can't picture the access pattern for a linear data structure that spans multiple levels and has multiple leaves; usually a tree adds one level of indirection for each node/leaf, which simplifies things a lot for the user, but how should one organize such a "linear" tree?
In practice, if the binary tree is built from a data set in one go (e.g., data read from a file), the node allocations usually end up contiguous anyway, since the nodes are typically allocated sequentially and are not interleaved with other allocations.
Binary trees in linked representation are stored in memory as linked structures: the nodes are not kept at adjacent memory locations, but are connected to one another through the parent-child links of the tree.
When using an array to store a complete tree, why is the root node stored at index 1 instead of at the front of the array at index 0? We use index zero as a guard to prevent overstepping the root when propagating up the tree from its leaf nodes, which would cause a memory access fault.
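To make that concrete, here is a minimal C++ sketch (my own illustration, not any particular library's code) of a complete tree stored 1-based in a vector, with slot 0 left unused as the guard; note how the "propagate up" loop stops at index 1, so the guard slot is never read:

    #include <cstddef>
    #include <utility>
    #include <vector>

    // 1-based layout: parent(i) = i / 2, left(i) = 2 * i, right(i) = 2 * i + 1.
    struct OneBasedTree {
        std::vector<int> a{0};              // a[0] is the unused guard slot

        void push(int value) {
            a.push_back(value);
            std::size_t i = a.size() - 1;
            // Propagate up from the leaf (max-heap order here, just as an example);
            // the loop stops at i == 1, the root, so we never step past it into a[0].
            while (i > 1 && a[i / 2] < a[i]) {
                std::swap(a[i / 2], a[i]);
                i /= 2;
            }
        }

        int root() const { return a[1]; }   // the root lives at index 1
    };

The 1-based indexing also keeps the parent/child arithmetic slightly simpler than the 0-based variant, since no +1/-1 adjustments are needed.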
An array is an example of a contiguous structure, since each element in the array is located next to one or two other elements. In contrast, items in a non-contiguous structure are scattered in memory and are linked to each other in some way. A linked list is an example of a non-contiguous data structure.
You might find the short article here interesting
Basically, the reasoning behind using a contiguous block of memory for such a structure is that it greatly improves lookup and scan times when dealing with potentially large sets of data. If your memory is non-contiguous, you may have to employ costly traversal algorithms to retrieve data from the data structure.
Hopefully this addresses your interests.
Here are the two images from the article that illustrate this concept:
A balanced tree
The tree as stored in contiguous memory:
There are actually many such patterns, which serve two purposes: saving memory, and keeping related nodes together, primarily for paging performance.
One very simple version is to allocate blocks of three nodes: a parent and its two children. Each such block then has four "child" blocks, two hanging off each child. This cuts the number of allocations to a third. That isn't much of an optimization on its own, but it pays off when you scale it up to blocks of 7, 15, 31, 63... nodes: if you can fit as many keys as possible onto a single memory page, you minimize the time spent waiting for the hard drive. If your keys are 32 bytes each and a page is 4K, a page holds 128 keys, enough for a full 7-row block of 127 nodes, which means you only have to load one page from the hard drive for every 7 rows of the tree. At that point you load the appropriate "child" page and follow another 7 rows. A regular binary tree may have only one node per page, which means you spend roughly 7x as much time simply waiting for the hard drive as you iterate the tree. Quite slow.

Rotations are a little tricky, because you have to actually swap the data, not just the pointers as is common with tree implementations. Also, this layout can waste a LOT of space when the tree becomes large. (A rough sketch of one such block follows the diagram below.)
              ---------
              |   O   |
              |  / \  |
              | O   O |
             _---------_
          __/  \     /  \__
         /     |     |     \
--------- --------- --------- ---------
|   O   | |   O   | |   O   | |   O   |
|  / \  | |  / \  | |  / \  | |  / \  |
| O   O | | O   O | | O   O | | O   O |
--------- --------- --------- ---------
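If it helps, here is a rough C++ sketch of what one such page-sized block might look like; the names, sizes, and the use of 32-bit block indices are my own assumptions for illustration, not any real library's layout:

    #include <cstddef>
    #include <cstdint>

    constexpr std::size_t kPageSize = 4096;   // assumed page size
    constexpr std::size_t kKeySize  = 32;     // assumed key size in bytes

    // A full binary subtree of kRows rows holds 2^kRows - 1 keys; 7 rows of
    // 32-byte keys (127 keys, 4064 bytes) is the most that fits in a 4K page.
    constexpr std::size_t kRows         = 7;
    constexpr std::size_t kKeysPerBlock = (std::size_t{1} << kRows) - 1;   // 127
    static_assert(kKeysPerBlock * kKeySize <= kPageSize, "keys must fit in one page");

    struct Key {
        std::uint8_t bytes[kKeySize];
    };

    // One block: a 7-row subtree stored implicitly (heap order, root at index 0)
    // plus references to the child blocks hanging off its bottom row. Note that
    // the child table makes the struct a bit larger than one page; the answer's
    // arithmetic, like this sketch, only counts the keys.
    struct Block {
        Key           keys[kKeysPerBlock];
        std::uint32_t child[kKeysPerBlock + 1];   // indices of child blocks
    };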
Another, far more complex "pattern" is to cut the tree in half vertically, so that the top half is one "subtree" with many child "subtrees", and to store each "subtree" linearly, repeating this recursively. It is a very strange layout, but it ends up behaving much like the pattern above, except that it is "cache-oblivious": it works with any page size or cache hierarchy. Quite cool, but such layouts are complex, and since virtually everything runs on one of three well-known architectures, they aren't popular. They are also extremely difficult to insert into and remove from.
Another very simple variant is to put the whole tree into an array accessed via indices, which saves memory overall, but only the top of the tree is cache-friendly; the lower levels are worse, cache-wise, than a regular binary tree. Effectively, the root is at index 0, and for a node at index n the left child is at 2n+1 and the right child is at 2n+2, while the parent is at (n-1)/2. For the node at index 24, the parent is at (24-1)/2 = 11, and the left and right children are at 49 and 50 respectively. (A small sketch of this index arithmetic follows the diagram below.)

This works great for small trees, because it requires no overhead for pointers whatsoever: the data is stored as a contiguous array of values, and the relationships are inferred from the indices. Also, adding and removing children always happens near the right end of the array, and the normal binary-tree insertion/rotation/erasure logic applies.

These trees also have an interesting mathematical novelty: if you convert the index plus one to binary, that corresponds to the node's location in the tree. For the node at index 24, 24+1 in binary is 11001; the leading 1 always means the root, and from there each 1 means "go right" and each 0 means "go left", so to reach index 24 from the root you go right, left, left, right, and you're there. Additionally, since there are 5 binary digits, you know it is in the fifth row. Neither of these observations is particularly useful, other than that they sort of imply the root node is a right child, which is vaguely interesting. (If you expand to other bases, the root is always the rightmost child.) That being said, it is still often useful to implement the root as a left node if you work with bidirectional iterators.
       0
     /   \
    /     \
   1       2
  / \     / \
 3   4   5   6

[0][1][2][3][4][5][6]
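For reference, here is a small C++ sketch of that index arithmetic (my own helper names, chosen only for illustration):

    #include <cstddef>

    constexpr std::size_t left_child(std::size_t n)  { return 2 * n + 1; }
    constexpr std::size_t right_child(std::size_t n) { return 2 * n + 2; }
    constexpr std::size_t parent(std::size_t n)      { return (n - 1) / 2; }  // not meaningful for the root (n == 0)

    // The example from the text: node 24 has parent 11 and children 49 and 50,
    // and 24 + 1 = 0b11001 spells the path root -> right, left, left, right.
    static_assert(parent(24) == 11 && left_child(24) == 49 && right_child(24) == 50,
                  "index arithmetic for the example node");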
Storing data structures in contiguous memory is a technique used on memory-constrained systems, such as embedded systems. The technique may also be used on safety- and performance-critical systems.
Desktop systems usually have a lot of memory and their applications are short-lived. Their dynamic memory allocation simply finds the next available block in the memory pool and returns it; if no block is available (for example because of fragmentation), the allocation fails. There is no control over how much memory can be consumed.
With a contiguous allocation scheme, the number of nodes created can be restricted or limited. This means that on a system with 32k of memory, the tree won't use up all the memory and leave holes.
The allocation process is also faster with a contiguous scheme, because you know where the blocks are. In addition, index values can be stored for the links instead of pointers. This also allows the tree to be stored into a file and retrieved easily.
You can model this by creating an array or vector of nodes. Change your node data structure to use array indices instead of pointers.
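A minimal sketch of that idea, with names and sizes of my own choosing (not code from this answer): the nodes live in one contiguous vector and refer to each other by index, and the pool's capacity bounds how many nodes the tree can ever consume.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    constexpr std::uint32_t kNull = 0xFFFFFFFF;   // "no node" marker

    struct Node {
        int           value;
        std::uint32_t left;    // index into the pool, not a pointer
        std::uint32_t right;
    };

    struct TreePool {
        std::vector<Node> nodes;

        explicit TreePool(std::size_t capacity) { nodes.reserve(capacity); }

        // Returns the index of the new node, or kNull if the pool is full.
        std::uint32_t allocate(int value) {
            if (nodes.size() == nodes.capacity()) return kNull;
            nodes.push_back(Node{value, kNull, kNull});
            return static_cast<std::uint32_t>(nodes.size() - 1);
        }
    };

Because the links are indices, the whole nodes vector can be written to a file in one block and read back without any pointer fix-up.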
Remember, the only way to know about performance issues is to profile.
how do you make sure that you are re-arranging the branches and the leaves in a meaningful way that helps performance?
If you have the program running already (with a non-contiguous tree), you could always just instrument your program to report what its actual node-access patterns typically look like. Once you have a good idea of how the nodes are accessed, you could then customize your node-allocator to allocate the nodes in memory in that same order.
" ... try to "linearize" a tree to store it as a vector/array, how you make sure that you are re-arranging the branches and the leafs in a meaningful way that helps performance ..."
I believe you are thinking too hard.
In a normal tree, you use 'new' to request free space in which to create a node.
You use delete to return the no longer needed space to the heap.
Then connect the nodes using pointers.
For a 'tree-in-a-vector', you might simply re-implement your new and delete to find space in a vector.
I think instead of pointers (to the parent, or to the left or right node) you use indices (to the parent, or to the left or right node).
I believe the index of the nth item in a vector is unchanged both before and after the vector re-allocates for growth, so the indices, unlike pointers or iterators, survive that re-allocation.
Another challenge is the delete of a node ... but this can be as simple as: every stored index greater than the erased node's index is reduced by 1. (A small sketch of that fix-up is at the end of this answer.)
These choices can be a fair trade-off for a tree that changes seldom, but must be captured (saved out) as quickly as possible.
Does storing your tree really require the performance of a vector block save? Is the vector block save actually faster than a depth-first save of the same tree? You really need to measure.
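To illustrate the erase fix-up mentioned above, here is a rough C++ sketch; the node layout and field names are my own assumptions:

    #include <cstdint>
    #include <vector>

    constexpr std::uint32_t kNoNode = 0xFFFFFFFF;   // "no link" marker

    struct Node {
        int           value;
        std::uint32_t parent;
        std::uint32_t left;
        std::uint32_t right;
    };

    // Erasing element `gone` from the vector shifts every later element down by
    // one position, so every stored index greater than `gone` must drop by one.
    inline void fix_index(std::uint32_t& idx, std::uint32_t gone) {
        if (idx != kNoNode && idx > gone) --idx;
    }

    inline void erase_node(std::vector<Node>& tree, std::uint32_t gone) {
        // Assumes the node has already been unlinked from its parent and children.
        tree.erase(tree.begin() + gone);
        for (Node& n : tree) {
            fix_index(n.parent, gone);
            fix_index(n.left,   gone);
            fix_index(n.right,  gone);
        }
    }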