Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Implementing a functional/persistent dictionary data structure

I'm trying to implement a functional dictionary in C. It is fairly easy to implement functional lists or b-trees but I can hardly find any references on dictionaries/associative arrays.

I had a look at erlang's dict implementation - in the source code they refer to this paper:
The Design and Implementation of Dynamic Hashing for Sets and Tables in Icon.

It would be great if someone could briefly explain erlang's approach or another solution to this problem.

like image 467
mirkokiefer Avatar asked Apr 21 '12 14:04

mirkokiefer


People also ask

Which data structure is used for implementing dictionary?

Dictionaries are often implemented as hash tables. Each element stored in a dictionary is identified by a key of type K. Dictionary represents a mapping from keys to values. Dictionaries have numerous applications.

Which data structure is used with functional programming languages?

Typically tuples, lists, and partially-evaluated functions are very common data structures in functional programming languages.

How would you implement a dictionary?

A Dictionary (also known as Table or Map) can be implemented in various ways: using a list, binary search tree, hashtable, etc., etc.

What is persistent data structure with example?

A data structure is said to be persistent if it is capable to maintaining its previous updates as separate versions and each version can be accessed and updated accordingly. It makes the data structure immutable and thread safe. For example, String class object in Java is immutable.


1 Answers

The implementation of a persistent data structure in C works basically the same way as it does in a functional language. Chris Okasaki's Purely Functional Data Structures is a great reference.

In general, it suffices to map fixed-width integers to objects, because while that doesn't give you a full-fledged dictionary by itself, you can build a dictionary on top: Use a hash of the actual key as the key of the underlying map, and have the leaves point to lists of (key, value) pairs of the same hash value.

The tricky part is memory management, since you don't generally know when parts of the data structure become unreachable. Luckily, since most persistent data structures are based on trees, reference counting usually works well. In order to be able to manage the objects referenced by the data structure, you can provide a hook for callbacks that get called when a leaf node's reference count becomes 0.

For example, my C implementation of bitmapped Patricia Trees provides the following API:

// Querying
void *bpt_get(bpt_t bpt, bpt_key_t key);
bool bpt_has_key(bpt_t bpt, bpt_key_t key);

// Adding and Removing Entries
bpt_t bpt_assoc(bpt_t bpt, bpt_key_t key, void *item);
bpt_t bpt_dissoc(bpt_t bpt, bpt_key_t key);

// Managing Memory
void bpt_retain(bpt_t bpt);
void bpt_release(bpt_t bpt);
void bpt_dealloc(bpt_t bpt);
void bpt_set_dealloc_hook(bpt_t bpt,
                          bpt_key_t key,
                          void (*hook)(bpt_key_t key,
                                       void* value));

// Iteration
void bpt_for_mappings(bpt_t bpt,
                      void (*thunk)(bpt_key_t, void*, void*),
                      void *user_data);

// Making a Map Persistent (you can elide this if you don't
// want to support transients)
void bpt_seal(bpt_t bpt);

The implementation might give you some ideas, too.

like image 94
Matthias Benkard Avatar answered Nov 07 '22 06:11

Matthias Benkard