I'm building a symbol table for a project I'm working on. I was wondering what peoples opinions are on the advantages and disadvantages of the various methods available for storing and creating a symbol table.
I've done a fair bit of searching and the most commonly recommended are binary trees or linked lists or hash tables. What are the advantages and or disadvantages of all of the above? (working in c++)
In linked lists, the nodes can be in any place in memory, because each of them has a pointer to the next node. A hash table is different from either because it doesn't store its elements in any particular order.
Binary Search Trees are generally memory-efficient since they do not reserve more memory than they need to. On the other hand, Hash tables can be a bit more demanding if we don't know the exact number of elements we want to store.
A hash table is just an array of linked lists. Each linked list holds all the items in the table that share the same hash code. Initially, all the lists are empty (represented as null in the array).
In a linked list, the items are linked together through a single next pointer. In a binary tree, each node can have 0, 1 or 2 subnodes, where (in case of a binary search tree) the key of the left node is lesser than the key of the node and the key of the right node is more than the node.
The standard trade offs between these data structures apply.
Your use case is presumably going to be "insert the data once (e.g., application startup) and then perform lots of reads but few if any extra insertions".
Therefore you need to use an algorithm that is fast for looking up the information that you need.
I'd therefore think the HashTable was the most suitable algorithm to use, as it is simply generating a hash of your key object and using that to access the target data - it is O(1). The others are O(N) (Linked Lists of size N - you have to iterate through the list one at a time, an average of N/2 times) and O(log N) (Binary Tree - you halve the search space with each iteration - only if the tree is balanced, so this depends on your implementation, an unbalanced tree can have significantly worse performance).
Just make sure that there are enough spaces (buckets) in the HashTable for your data (R.e., Soraz's comment on this post). Most framework implementations (Java, .NET, etc) will be of a quality that you won't need to worry about the implementations.
Did you do a course on data structures and algorithms at university?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With