
Garbage collector vs. collections

I have read a few posts about garbage collection in Java, but I still cannot decide whether clearing a collection explicitly is considered good practice. Since I could not find a clear answer, I decided to ask here.

Consider this example:

List<String> list = new LinkedList<>();
// here we use the list, perhaps adding hundreds of items in it...
// ...and now the work is done, the list is not needed anymore
list.clear();
list = null;

From what I saw in the implementations of e.g. LinkedList or HashSet, the clear() method basically just loops over all the items in the given collection, setting all its elements (and, in the case of LinkedList, also the references to the next and previous elements) to null.

If I got it right, setting list to null just removes one reference to the list. Assuming it was the only reference, the garbage collector will eventually take care of the list. I just don't know how long it would take until the list's elements are also processed by the garbage collector in this case.

So my question is - do the last two lines of the above listed example code actually help the garbage collector to work more efficiently (i.e. to collect the list's elements earlier) or would I just make my application busy with "irrelevant tasks"?

Sva.Mu asked Sep 10 '15 21:09



2 Answers

The last two lines do not help.

  • Once the list variable goes out of scope*, if that's the last reference to the linked list then the list becomes eligible for garbage collection. Setting list to null immediately beforehand adds no value.

  • Once the list becomes eligible for garbage collection, so too do its elements, provided the list holds the only references to them. Clearing the list is unnecessary.

For the most part you can trust the garbage collector to do its job and do not need to "help" it.

* Pedantically speaking, it's not scope that controls garbage collection, but reachability. Reachability isn't easy to sum up in one sentence. See this Q&A for an explanation of this distinction.
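The reachability point can be checked with a small experiment. What follows is only a sketch: the class and method names are made up for illustration, and System.gc() is merely a hint to the JVM, so the second method may report false on some JVMs or configurations even though the element is eligible for collection.

```java
import java.lang.ref.WeakReference;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ReachabilityDemo {

    // Adds one element to the list and returns a weak reference to it.
    // A weak reference does not keep its target alive.
    private static WeakReference<Object> addProbe(List<Object> list) {
        Object element = new Object();
        list.add(element);
        return new WeakReference<>(element);
    }

    // While the list is still in use, its element stays strongly reachable.
    static boolean elementAliveWhileListHeld() {
        List<Object> list = new LinkedList<>();
        WeakReference<Object> probe = addProbe(list);
        System.gc(); // a hint only; the JVM may ignore it
        boolean alive = probe.get() != null;
        // Reading the list after the check keeps it reachable up to this point.
        return alive && list.size() == 1;
    }

    // Drops the list without clear() or "= null", then waits for the element to go away.
    static boolean elementCollectedAfterDrop() throws InterruptedException {
        WeakReference<Object> probe = addProbe(new LinkedList<>());
        // Nothing refers to the list from here on.
        for (int i = 0; i < 50 && probe.get() != null; i++) {
            System.gc();
            Thread.sleep(10);
        }
        return probe.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("alive while held: " + elementAliveWhileListHeld());
        System.out.println("collected after drop: " + elementCollectedAfterDrop());
    }
}
```

The second method never calls clear() and never assigns null. The list simply stops being reachable, and its element becomes collectable along with it.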


One common exception to this rule is if you have code that will retain references longer than they're needed. The canonical example of this is with listeners. If you add a listener to some component, and later on that listener is no longer needed, you need to explicitly remove it. If you don't, that listener can inhibit garbage collection of both itself and of the objects it has references to.

Let's say I added a listener to a button like so:

button.addListener(event -> label.setText("clicked!"));

Then later on the label is removed, but the button remains.

window.removeChild(label);

This is a problem because the button has a reference to the listener and the listener has a reference to the label. The label can't be garbage collected even though it's no longer visible on screen.

This is a time to take action and get on the GC's good side. I need to remember the listener when I add it...

Listener listener = event -> label.setText("clicked!");
button.addListener(listener);

...so that I can remove it when I'm done with the label:

window.removeChild(label);
button.removeListener(listener);
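To make the snippets above self-contained, here is a minimal sketch with stand-in types. Button, Label and Listener are hypothetical classes written for this example, not the API of any real UI toolkit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a UI toolkit; not a real library API.
class ListenerCleanupSketch {

    interface Listener {
        void onEvent(String event);
    }

    static class Label {
        private String text = "";
        void setText(String text) { this.text = text; }
        String getText() { return text; }
    }

    static class Button {
        // The button holds strong references to every registered listener.
        private final List<Listener> listeners = new ArrayList<>();

        void addListener(Listener l) { listeners.add(l); }
        void removeListener(Listener l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }

        void click() {
            // Iterate over a copy so a listener may unregister itself while being notified.
            for (Listener l : new ArrayList<>(listeners)) {
                l.onEvent("click");
            }
        }
    }

    // Registers a listener, fires it once, then unregisters it.
    // Returns how many listeners the button still holds afterwards.
    static int registerClickAndCleanUp(Button button, Label label) {
        Listener listener = event -> label.setText("clicked!");
        button.addListener(listener);
        button.click();
        button.removeListener(listener); // must be the same instance that was added
        return button.listenerCount();
    }

    public static void main(String[] args) {
        Button button = new Button();
        Label label = new Label();
        int remaining = registerClickAndCleanUp(button, label);
        System.out.println(label.getText() + ", listeners left: " + remaining);
    }
}
```

Keeping the listener in a variable matters here: passing a second, textually identical lambda to removeListener would not match the registered one, so the button would keep its reference to the original listener and, through it, to the label.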
John Kugelman answered Oct 21 '22 00:10


It depends on the following factors

  • how clear() is implemented
  • the allocation patterns for the entries held by the collection
  • the garbage collector
  • whether there might be other things holding onto the collection or subviews of it (does not apply to your example but common in the real world)

For a primitive, non-generational, tracing garbage collector, clearing out references only means extra work for the application without making things much easier on the GC. But clearing may still help if you cannot guarantee that all references to the collection are nulled out in a timely manner.

For generational GCs, and especially G1GC, nulling out references inside a collection (or a reference array) may be helpful under some circumstances by reducing cross-region references.

But that only helps if you actually have allocation patterns that create objects in different regions and put them into a collection living in another region. It also depends on the clear() implementation nulling out those references, which turns clearing into an O(n) operation when it could often be implemented as an O(1) one.
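The O(n) versus O(1) distinction can be sketched with a toy list. TinyLinkedList below is a hypothetical class written for illustration, not java.util.LinkedList; its clearUnlinking() follows the same pattern as OpenJDK 8's LinkedList.clear(), while clearFast() is the cheaper alternative a different implementation might choose.

```java
// Hypothetical minimal doubly linked list; not java.util.LinkedList.
class TinyLinkedList<E> {

    static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(E item) { this.item = item; }
    }

    Node<E> first;
    Node<E> last;
    int size;

    // Appends an item and returns its node so callers can observe it later.
    Node<E> add(E item) {
        Node<E> node = new Node<>(item);
        if (last == null) {
            first = node;
        } else {
            last.next = node;
            node.prev = last;
        }
        last = node;
        size++;
        return node;
    }

    // O(1): drops only the list's own references; the nodes still point at each other.
    void clearFast() {
        first = last = null;
        size = 0;
    }

    // O(n): also nulls every node's fields, so no node refers to another node or to an item.
    void clearUnlinking() {
        for (Node<E> x = first; x != null; ) {
            Node<E> next = x.next;
            x.item = null;
            x.next = null;
            x.prev = null;
            x = next;
        }
        first = last = null;
        size = 0;
    }
}
```

If something else still holds one node (an iterator, for instance), clearFast() leaves the entire chain and all items reachable through that node, whereas after clearUnlinking() only the single empty node is kept alive.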

So for your concrete example the answer would be as follows:

If

  • your list is long-lived
  • the lists created on that code-path make up/hold onto a significant fraction of the garbage your application produces
  • you're using G1 or a similar multi-generational collector
  • the list slowly accumulates objects before eventually being released (this usually puts them in different regions, thus creating cross-region references)
  • you wish to trade CPU-time on clearing for reduced GC workload
  • the clear() implementation is O(n) instead of O(1), i.e. it nulls out all entries. OpenJDK 1.8's LinkedList does this.

then it may be beneficial to call clear() before releasing the collection itself.

So at best this is a very workload-specific micro-optimization that should only be applied after profiling/monitoring the application under realistic conditions and determining that GC overhead justifies the extra cost of clearing.
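As a rough starting point for that kind of monitoring, the standard java.lang.management API exposes per-collector counters. The sketch below is one possible way to read them; the class name GcOverheadProbe and the placeholder workload are assumptions for illustration, and the counters are coarse compared with GC logs (for example -verbose:gc, or -Xlog:gc on JDK 9 and later) or a profiler.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; sums the counters the JVM reports for all its collectors.
class GcOverheadProbe {

    // Total accumulated collection time in milliseconds.
    // getCollectionTime() returns -1 when undefined, so non-positive values are skipped.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // Total number of collections; getCollectionCount() also returns -1 when undefined.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long timeBefore = totalGcTimeMillis();
        long countBefore = totalGcCount();

        // Placeholder: run the workload under test here, once with clear() and once without.

        System.out.println("GC runs: " + (totalGcCount() - countBefore)
                + ", GC time (ms): " + (totalGcTimeMillis() - timeBefore));
    }
}
```

Comparing these deltas for the two variants under a realistic load gives a first indication of whether the extra clear() call pays for itself.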


For reference, OpenJDK 1.8's LinkedList::clear:

/**
 * Removes all of the elements from this list.
 * The list will be empty after this call returns.
 */
public void clear() {
    // Clearing all of the links between nodes is "unnecessary", but:
    // - helps a generational GC if the discarded nodes inhabit
    //   more than one generation
    // - is sure to free memory even if there is a reachable Iterator
    for (Node<E> x = first; x != null; ) {
        Node<E> next = x.next;
        x.item = null;
        x.next = null;
        x.prev = null;
        x = next;
    }
    first = last = null;
    size = 0;
    modCount++;
}
the8472 answered Oct 21 '22 01:10