With a focus on read performance, I want to create a Term, such as an Orddict or Proplist, that contains a large number of entries (hundreds of thousands), each consisting of an ID and a Term value. This encapsulating Term should be able to return the value stored under a given key, just as an Orddict does.
Example:
K001 - Term001
K002 - Term002
K003 - Term003
The resulting Term containing the whole set needs to be passed from function to function for several computing purposes, without storing it in a persistent store, to avoid disk I/O. I have also chosen not to use memory caching at this stage, to avoid architectural complexity, so my focus is simply on making all of this searchable by key.
Orddicts are key-sorted, which speeds up seeking a key compared to a normal Dict. I am not aware of any other Erlang module that embeds a more efficient indexing mechanism within its Term.
Any suggestions for an approach better than an Orddict?
Actually, orddict is implemented as a sorted list (source), so it performs poorly both for insertion and lookup, especially when the keys are inserted in ascending order. Stay away from it; it won't work for your use case. dict is a hash-based data structure and offers solid insert/lookup performance. If the order of keys is important to you, consider using a tree-based map (such as gb_trees), as you can extract an ordered key sequence with an in-order tree walk.
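As a rough illustration of the two alternatives mentioned above, here is a minimal sketch that loads the same key/value pairs into a dict and into a gb_trees tree, then looks a key up in each. The module name kv_lookup, the string keys, and the atom values are placeholders chosen for this example, not anything from the question.

```erlang
-module(kv_lookup).
-export([demo/0]).

%% Hypothetical demo: build a dict and a gb_tree from the same pairs
%% and check that lookups behave as expected.
demo() ->
    %% Placeholder data standing in for the K001 - Term001 entries.
    Pairs = [{"K001", term001}, {"K002", term002}, {"K003", term003}],

    %% Hash-based: no key ordering.
    D = dict:from_list(Pairs),
    {ok, term002} = dict:find("K002", D),
    error = dict:find("K999", D),

    %% Tree-based: keys are kept in order.
    T = lists:foldl(fun({K, V}, Acc) -> gb_trees:enter(K, V, Acc) end,
                    gb_trees:empty(), Pairs),
    {value, term002} = gb_trees:lookup("K002", T),
    none = gb_trees:lookup("K999", T),

    %% gb_trees:keys/1 returns the keys in ascending order.
    ["K001", "K002", "K003"] = gb_trees:keys(T),
    ok.
```

Both structures are ordinary immutable terms, so either D or T can be passed from function to function as the question requires.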
If you want to share a large dataset between Erlang processes, you can try ETS. It is a fast in-memory key-value store that only supports destructive updates.
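A minimal sketch of the ETS approach might look like the following. The module name ets_lookup, the table name terms, and the chosen options (set, protected, read_concurrency) are assumptions made for this example; the right options depend on how the table is actually read and written.

```erlang
-module(ets_lookup).
-export([demo/0]).

%% Hypothetical demo: store entries in an ETS table and read them back.
demo() ->
    %% A set table holds one object per key; read_concurrency is an
    %% assumed tuning choice for a read-heavy workload.
    Tab = ets:new(terms, [set, protected, {read_concurrency, true}]),
    true = ets:insert(Tab, [{"K001", term001},
                            {"K002", term002},
                            {"K003", term003}]),

    %% Lookup returns a list of matching objects (empty if none).
    [{"K002", term002}] = ets:lookup(Tab, "K002"),
    [] = ets:lookup(Tab, "K999"),

    %% Updates are destructive: inserting an existing key overwrites it.
    true = ets:insert(Tab, {"K002", updated}),
    [{"K002", updated}] = ets:lookup(Tab, "K002"),

    true = ets:delete(Tab),
    ok.
```

Unlike a dict or gb_trees value, the table is not a term that gets copied around: functions and processes share it by passing the table identifier Tab.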