Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Memory sharing of persistent data structures in clojure

Tags:

clojure

I have started learning Clojure and was reading about structural sharing. I am confused in the following scenario : following clojure codes are typed in REPL in the sequence defined below :

1) (def a [1 2 3]),
2) (def b a),
3) (def a (conj a 4)),
4) (def b (conj b 5)),

After the 4th step, will a and b share the structure for the first three elements or all values will be copied on the execution of 4th step ? If the structure is shared, how will Clojure be able to return us the value at say index 3 ?

This is somewhat related to Structural Sharing in Clojure, but I am still confused. Any kind of help will be appreciated.

like image 458
abhishekmahawar Avatar asked Mar 23 '23 11:03

abhishekmahawar


1 Answers

In the example given in the question text, no structural sharing happens at all. This is because vectors are implemented as trees, where the actual elements are stored in leaf nodes of size 32 (with the final leaf stored separately as the "tail" of a vector -- a performance optimization) and branch nodes are likewise 32-way. So, in order for structural sharing to come into play, one needs a large enough vector:

;; a base vector:
(def v1 (vec (range 31)))

;; no structural sharing -- all elements are copied:
(def v2 (conj v1 31))

;; leftmost leaf of internal tree uses v2's tail as its internal array:
(def v3 (conj v2 32))

;; leftmost leaf shared with v3
(def v4 (conj v3 33))

In general, whenever one conjs an object onto an existing vector, the new vector either (1) shares the entire internal tree with the original but has a new tail or (2) shares with the original, at each level of the original's internal tree, all nodes except the rightmost one (and may have an internal tree taller by one level than that of the original). (Clearly all the elements of the original vector are shared with the new one, however.)

As for looking up values by index, it happens in the same way in every case -- the vectors don't care whether their structure happens to be shared with other vectors (and there is no reason why they'd need to, given that it never changes).

like image 129
Michał Marczyk Avatar answered Apr 01 '23 13:04

Michał Marczyk