In general, binary tree based abstractions can be implemented either using actual linked node objects, where each node has pointers to it's two children, or an array, where the children of node in index k are 2k and 2k+1.
Other than the small extra memory overhead of nodes, the complexity in general seems to be identical.
Are there any concrete advantages of one over the other? Anecdotally, I've seen that binary heaps tend to use the array implementation, while binary search trees tend to use linked nodes implementation. Any reason for this?
In a linked list, the items are linked together through a single next pointer. In a binary tree, each node can have 0, 1 or 2 subnodes, where (in case of a binary search tree) the key of the left node is lesser than the key of the node and the key of the right node is more than the node.
Arrays are almost always preferable. Linked lists will perform better when both of the following conditions are met: insertions or deletions in the middle are frequent; the position where the insertion or deletion occurs is found by following a pointer, and not by traversing the list.
1. An array is a grouping of data elements of equivalent data type. A linked list is a group of entities called a node. The node includes two segments: data and address.
Arrays cannot efficiently represent arbitrarily-shaped binary trees, only complete trees. A complete binary tree is one in which all levels are full, OR all levels except the deepest level are full and the deepest level has all of its nodes as far to the left as possible. (You can imagine that the levels are filled with nodes from left to right, and one level has to be filled before the next level can begin.)
Heaps are, by definition, complete binary trees - hence the array implementation is used due to its superior memory efficiency. On the other hand, binary search trees that must support insertion and removal at arbitrary locations (and thus may not be complete trees) cannot use the array implementation.
First of all, it is a legitimate question: binary trees can indeed be embedded in arrays. phari's answer is incorrect: it is possible with some effort to embed trees of arbitrary shapes into arrays (as long as you have enough memory). A straightforward representation would involve defining a Node
as a variant type: either it is Filled
or Empty
, where Filled
contains the key and the auxiliary data, and Empty
is analogous to Nil
(aka the null pointer). If the only operation you need to support is delete
, then you're all set up: just implement the build
operation to return a complete tree and then implement the normal binary tree delete
operation. No balancing required to achieve O(log n)
complexity bound (where n
is the initial number of items the tree).
It is also possible with a significant effort to implement the insert
operation by maintaining the tree in a balanced shape. Simplifying a bit, you maintain a nearly-complete tree with storage size no more than 2n
(where n
is the current number of items in the tree). When a new item is inserted, you check where the appropriate array cell to insert it is: if it is inside the allocated storage, you just write it to that cell. Otherwise, you go up the tree starting from the parent of that cell until you find a subtree whose storage has enough room to fit all items including the new one; if no such subtree exists, you double the storage. After finding that subtree, you rebuild it into a nearly-complete shape and insert the new item into the correct array cell (which is now guaranteed to be within the allocated storage). All this can be done in amortized O(log^2(n))
time, or in amortized O(log(n))
time with even more effort.
The above algorithm sketch is based on "Cache Oblivious Search Trees via Binary Trees of Small Height".
I have implemented a data structure called TeardownTree, which uses this kind of embedding. I support build
, delete
, delete_range
, query_range
, iter
operations on the master branch, and an experimental insert
operation on the insert
branch. The project page also contains some benchmarks that show the data structure is definitely viable at least for some usages.
You might also be interested in this blog post explaining how to build trees in constant auxiliary space (a very fast method in practice).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With