If I hold e reference to a part of a persistent collection, can the whole collection be garbage collected? Do I understand this correctly?
The function gctest is just to test the behavior of collections.
(defn gctest
"A shot about testing the gc-ability of persistent thingies."
[n]
(take 5 (drop 100 (vec (range n)))))
main=> (def a (gctest 1e7))
main=> (def b (gctest 1e7))
main=> (def c (gctest 1e7))
main=> (def d (gctest 1e7))
main=> (def e (gctest 1e7))
main=> (def f (gctest 1e7))
main=> (def g (gctest 1e7))
OutOfMemoryError GC overhead limit exceeded clojure.lang.ArrayChunk.dropFirst (ArrayChunk.java:54)
I already heard about head retention, but this seems a bit more general or isn't it?
What I want to understand is how I can use big changing collection. I expect big parts of the collections change over time, in a way that big parts can be garbage collected in principle, but not all of them.
Is there a standard way to cope with this?
An object is eligible to be garbage collected if its reference variable is lost from the program during execution. Sometimes they are also called unreachable objects. What is reference of an object? The new operator dynamically allocates memory for an object and returns a reference to it.
The main objective of Garbage Collector is to free heap memory by destroying unreachable objects. The garbage collector is the best example of the Daemon thread as it is always running in the background.
Garbage collection makes Java memory efficient because it removes the unreferenced objects from heap memory and makes free space for new objects. The Java Virtual Machine has eight types of garbage collectors.
Garbage collection has to scan memory, thus its complexity must be a function of N, where N may be the number of bytes or the number of objects. It must be bound by a linear function of N, since ultimately it needs to scan all of these objects.
Garbage collection services refers to the collection and removal of waste that cannot be recycled or reused. For this exercise, waste disposal services (such as landfill site operations) were not included in calculations. Waste diversion programs (such as recycling) were also separated and the benchmarking results are presented here.
They aren’t looking for things to delete, they know exactly what has to be removed. Garbage collection is a method to free the programmer from worrying about basic memory management. It isn’t the only option; simple reference counting also covers the vast majority of memory management needs.
Some cities have included the cost of waste disposal in their costs of garbage collection, whereas other cities have not. Waste disposal facilities are the single most expensive component of the waste program and these costs continue to increase as environmental regulations become more stringent.
Standard GC rule remains: as long as you keep the reference to a part of collection, all objects accessible from your references stay in memory. So only the part of collection that is accessible from your references will be hold, the rest will be collected. In particular, if you refer to the last 50 elements of 100 element list, first 50 elements will be collected and the rest will stay in memory.
However, in your case all elements of each collection starting from 100th will be kept. And the reason for it is lazy evaluation. Function take
produces lazy sequence of (in your case) 5 elements. Lazy sequence object itself is not real sequence, instead it is special generator object (though it's not Clojure's, but rather Python's term). When you need element of lazy sequence, generator object generates and returns it. But if you do not ask for the element, generator just keeps references to all objects it may need to generate an element.
In your example you create large vector and ask for 5 elements from it, and then save result to variables a
, b
, c
, etc. Clojure makes large vector and generator object, pointing to 100th element. Reference to collection itself is lost, but reference to generator object is saved on top level. You never evaluate generator objects and thus never make real 5 element sequences. REPL refers to vars a
, b
, c
, etc., these vars refer to generator objects, and generator objects refer to the collections they need to produce real 5 element sequences. Thus all elements (except may be first 100 of them) of all collections have to stay in memory.
On other hand, if you evaluate generator objects, they will produce real 5 element sequences and forget reference to the rest of collection. Try this:
user> (def a (gctest 1e7))
#'user/a
user> (println a)
(100 101 102 103 104)
nil
user> (def b (gctest 1e7))
#'user/b
user> (println b)
(100 101 102 103 104)
nil
user> (def c (gctest 1e7))
#'user/c
user> (println c)
(100 101 102 103 104)
nil
user> (def d (gctest 1e7))
#'user/d
user> (println d)
(100 101 102 103 104)
nil
user> (def e (gctest 1e7))
#'user/e
user> (println e)
(100 101 102 103 104)
nil
user> (def f (gctest 1e7))
#'user/f
user> (println f)
(100 101 102 103 104)
nil
user> (def g (gctest 1e7))
#'user/g
user> (println g)
(100 101 102 103 104)
nil
user> (def h (gctest 1e7))
#'user/h
user> (println h)
(100 101 102 103 104)
nil
user> (def i (gctest 1e7))
#'user/i
user> (println i)
(100 101 102 103 104)
nil
There's no OutOfMemory! Vars a
, b
, c
, etc. now store real lists of 5 elements, and thus there are no more references to large collections, so they can be collected.
These kinds of problems can be solved using lazy sequences. In your case you have used vec
function which actually create a vector in memory by going though each element generated by the range
function (range returns a lazy sequence).
Below code (without vec
call won't have the memory problem)
(defn gctest
"A shot about testing the gc-ability of persistent thingies."
[n]
(take 5 (drop 100 (range n))))
UPDATE:
When you use the vec
call, all the elements in the memory will be retained and not collected by GC because these elements are referenced (and required) by the sequence object returned from the function gctest
so that it can fetch the required elements (i.e skipping 100 elements and taking 5 elements) when the sequence object is asked for elements.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With