Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In Java (1.5 or later), what is the best performing way to fetch an (any) element from a Set?

In the code below, I needed to fetch an element, any element, from toSearch. I was unable to find a useful method on the Set interface definition to return just a single (random, but not required to be random) member of the set. So, I used the toArray()[0] technique (present in the code below).

private Set<Coordinate> floodFill(Value value, Coordinate coordinateStart)
{
    Set<Coordinate> result = new LinkedHashSet<Coordinate>();

    Set<Coordinate> toSearch = new LinkedHashSet<Coordinate>();
    toSearch.add(coordinateStart);
    while (toSearch.size() > 0)
    {
        Coordinate coordinate = (Coordinate)toSearch.toArray()[0];
        result.add(coordinate);
        toSearch.remove(coordinate);
        for (Coordinate coordinateAdjacent: getAdjacentCoordinates(coordinate))
        {
            if (this.query.getCoordinateValue(coordinateAdjacent) == value)
            {
                if (!result.contains(coordinateAdjacent))
                {
                    toSearch.add(coordinateAdjacent);
                }
            }
        }
    }

    return result;
}

The other technique I have seen discussed is to replace "(Coordinate)toSearch.toArray()[0]" with "toSearch.iterator().next()". Which technique, toArray() or iterator(), is the most likely to execute the most quickly with the least GC (Garbage Collection) impact?

My intuition (after composing this question) is that the second technique using the Iterator will be both faster in execution and lower overhead for the GC. Given I don't know the implementation of the Set being passed (assuming HashSet or LinkedHashSet as most likely), how much overhead is incurred in each of the toArray() or iterator() methods? Any insights on this would be greatly appreciated.

Questions (repeated from above):

  1. Which technique, toArray() or iterator(), is the most likely to execute the most quickly with the least GC (Garbage Collection) impact?
  2. Given I don't know the implementation of the Set being passed (assuming HashSet or LinkedHashSet as most likely), how much overhead is incurred in each of the toArray() and iterator() methods?
like image 446
chaotic3quilibrium Avatar asked Dec 04 '10 23:12

chaotic3quilibrium


People also ask

What does set() do in java?

The set() method of Java Stack is used to replace any particular element in the stack created using the Stack class with another element. This can be done by specifying the position of the element to be replaced and the new element in the parameter of the set() method.

What is set<> java?

A Set is a Collection that cannot contain duplicate elements. It models the mathematical set abstraction. The Set interface contains only methods inherited from Collection and adds the restriction that duplicate elements are prohibited.

Can we iterate set in Java?

Set iterator() method in Java with Examplesutil. Set. iterator() method is used to return an iterator of the same elements as the set. The elements are returned in random order from what present in the set.


2 Answers

toSearch.iterator().next() will be faster and less memory-intensive because it does not need to copy any data, whereas toArray will allocate and copy the contents of the set into the array. This is irrespective of the actual implementation: toArray will always have to copy data.

like image 102
Cameron Skinner Avatar answered Sep 20 '22 01:09

Cameron Skinner


From what I can see you are doing Breadth First Search

Below is the example how it could be implemented without using toArray:

    private Set<Coordinate> floodFill(Value value, Coordinate coordinateStart) {
    final Set<Coordinate> visitedCoordinates = new LinkedHashSet<Coordinate>();
    final Deque<Coordinate> deque = new ArrayDeque<Coordinate>();

    deque.push(coordinateStart);

    while (!deque.isEmpty()) {
        final Coordinate currentVertex = deque.poll();
        visitedCoordinates.add(currentVertex);
        for (Coordinate coordinateAdjacent : getAdjacentCoordinates(currentVertex)) {
            if (this.query.getCoordinateValue(coordinateAdjacent) == value) {
                if (!visitedCoordinates.contains(coordinateAdjacent)) {
                    deque.add(coordinateAdjacent);
                }
            }
        }
    }

    return visitedCoordinates;
}

Implementation notes:

And now I am concerned that the contains() method's implementation on LinkedList could be doing up to a a full scan of the contents before returning the answer.

You are right about full scan (aka linear search). Nevertheless, In your case it's possible to have additional set for tracking already visited vertexes(btw, actually it's your result!), that would solve issue with contains method in O(1) time.

Cheers

like image 33
Petro Semeniuk Avatar answered Sep 20 '22 01:09

Petro Semeniuk