 

Should Iterator or Iterable be used when exposing internal collection items?

I have a class with a private mutable list of data.

I need to expose the list items under the following conditions:

  • The list must not be modifiable from outside the class;
  • It should be clear to developers who use the getter that the list they get cannot be modified.

Which getter function should be marked as the recommended approach? Or can you offer a better solution?

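import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

class DataProcessor {
    private final ArrayList<String> simpleData = new ArrayList<>();
    private final CopyOnWriteArrayList<String> copyData = new CopyOnWriteArrayList<>();

    public void modifyData() {
        // ...
    }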

    public Iterable<String> getUnmodifiableIterable() {
        return Collections.unmodifiableCollection(simpleData);
    }

    public Iterator<String> getUnmodifiableIterator() {
        return Collections.unmodifiableCollection(simpleData).iterator();
    }

    public Iterable<String> getCopyIterable() {
        return copyData;
    }

    public Iterator<String> getCopyIterator() {
        return copyData.iterator();
    }
}

Update: this question comes from a real code-review discussion about best practices for implementing list getters.

asked May 12 '15 15:05 by kza


2 Answers

The "best" solution actually depends on the intended application patterns (and not so much on "opinions", as suggested by a close-voter). Each possible solution has pros and cons that can be judged objectively (and have to be judged by the developer).


Edit: There already was a question "Should I return a Collection or a Stream?", with an elaborate answer by Brian Goetz. You should consult that answer as well before making any decision. My answer does not refer to streams, but only to different ways of exposing the data as a collection, pointing out the pros, cons and implications of the different approaches.


Returning an iterator

Returning only an Iterator is inconvenient, regardless of further details such as whether it allows modifications or not. An Iterator alone cannot be used in a for-each loop, so clients would have to write

Iterator<String> it = data.getUnmodifiableIterator();
while (it.hasNext()) {
    String s = it.next();
    process(s);
}

whereas basically all other solutions would allow them to just write

for (String s : data.getUnmodifiableIterable()) {
    process(s);
}

Exposing a Collections.unmodifiable... view on the internal data:

You could expose the internal data structure, wrapped into the corresponding Collections.unmodifiable... collection. Any attempt to modify the returned collection will cause an UnsupportedOperationException to be thrown, clearly stating that the client should not modify the data.
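As a minimal sketch of that behavior (the class name UnmodifiableViewDemo and the helper rejectsModification are made up for illustration), the following shows that the wrapper rejects a modification attempt while the internal list stays untouched:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the question's code.
class UnmodifiableViewDemo {
    private final List<String> internalData = new ArrayList<>();

    UnmodifiableViewDemo() {
        internalData.add("a");
        internalData.add("b");
    }

    // Read-only view: no copy is made, the wrapper delegates to internalData.
    List<String> getData() {
        return Collections.unmodifiableList(internalData);
    }

    // Returns true if the attempted modification was rejected by the list.
    static boolean rejectsModification(List<String> list) {
        try {
            list.add("c");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        UnmodifiableViewDemo demo = new UnmodifiableViewDemo();
        System.out.println(rejectsModification(demo.getData())); // prints "true"
    }
}
```

Note that the failure only shows up at runtime: the static type List<String> still offers add and remove, which is why the documentation matters.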

One degree of freedom in the design space here is whether or not you hide additional information: When you have a List, you could offer a method

private List<String> internalData;

List<String> getData() {
    return Collections.unmodifiableList(internalData);
}

Alternatively, you could be less specific about the type of the internal data:

  • If the caller should not be able to do indexed access with the List#get(int index) method, then you could change the return type of this method to Collection<String>.
  • If the caller additionally should not be able to obtain the size of the returned sequence by calling Collection#size(), then you could return an Iterable<String>.

Also consider that, when exposing the less specific interfaces, you later have the choice to change the type of the internal data to be a Set<String>, for example. If you had guaranteed to return a List<String>, then changing this later may cause some headaches.
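To illustrate the point about less specific return types, here is a sketch in which the internal data is a Set rather than a List. The class TagRegistry and its method names are assumptions made for this example; callers that only see Collection<String> or Iterable<String> would not notice whether the field was a List or a Set:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical class; the field could equally have been a List in an earlier version.
class TagRegistry {
    private final Set<String> internalData = new LinkedHashSet<>();

    void add(String tag) {
        internalData.add(tag);
    }

    // Collection exposes size() and contains(), but no indexed access.
    Collection<String> getData() {
        return Collections.unmodifiableCollection(internalData);
    }

    // Iterable exposes nothing but iteration.
    Iterable<String> getDataAsIterable() {
        return Collections.unmodifiableCollection(internalData);
    }

    public static void main(String[] args) {
        TagRegistry registry = new TagRegistry();
        registry.add("java");
        registry.add("collections");
        for (String tag : registry.getDataAsIterable()) {
            System.out.println(tag);
        }
    }
}
```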


Exposing a copy of the internal data:

A very simple solution is to just return a copy of the list:

private List<String> internalData;

List<String> getData() {
    return new ArrayList<String>(internalData);
}

This may have the drawback of (potentially large and frequent) memory copies, and thus should only be considered when the collection is "small".

Additionally, the caller will be able to modify the returned list and might expect the changes to be reflected in the internal state (which is not the case). This problem could be alleviated by additionally wrapping the new list in Collections.unmodifiableList.
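A sketch of the combined approach, copying first and then wrapping the copy (SnapshotHolder and add are hypothetical names for this example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class illustrating "copy, then wrap".
class SnapshotHolder {
    private final List<String> internalData = new ArrayList<>();

    void add(String s) {
        internalData.add(s);
    }

    // The copy decouples the result from later internal changes;
    // the wrapper makes it obvious that modifying the result is pointless.
    List<String> getData() {
        return Collections.unmodifiableList(new ArrayList<>(internalData));
    }

    public static void main(String[] args) {
        SnapshotHolder holder = new SnapshotHolder();
        holder.add("a");
        List<String> snapshot = holder.getData();
        holder.add("b");
        System.out.println(snapshot.size());         // prints "1"
        System.out.println(holder.getData().size()); // prints "2"
    }
}
```

On Java 10 and later, List.copyOf(internalData) achieves much the same in one call, with the difference that it rejects null elements.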


Exposing a CopyOnWriteArrayList

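Exposing a CopyOnWriteArrayList directly as an Iterable is probably not a good idea: although its iterators are snapshot iterators that throw an UnsupportedOperationException on Iterator#remove, the caller can cast the returned Iterable back to a List and modify it, and you explicitly wanted to avoid this. Exposing only its Iterator does prevent modification, but has the inconvenience described above.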

The solution of exposing a CopyOnWriteArrayList which is wrapped in a Collections.unmodifiableList may be an option. It may look like a superfluously thick firewall at first glance, but it can definitely be justified - see the next section.
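The difference between handing out the list itself and handing out a wrapper can be sketched as follows. CopyOnWriteHolder, getRawIterable and add are illustrative names, not taken from the question:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical class contrasting the raw list with a wrapped one.
class CopyOnWriteHolder {
    private final CopyOnWriteArrayList<String> copyData = new CopyOnWriteArrayList<>();

    void add(String s) {
        copyData.add(s);
    }

    // Returns the list itself, merely typed as Iterable:
    // a cast gets the caller back to a mutable List.
    Iterable<String> getRawIterable() {
        return copyData;
    }

    // The wrapper is a different object whose mutators throw
    // UnsupportedOperationException, so a cast does not help the caller.
    List<String> getData() {
        return Collections.unmodifiableList(copyData);
    }

    public static void main(String[] args) {
        CopyOnWriteHolder holder = new CopyOnWriteHolder();
        holder.add("a");
        ((List<String>) holder.getRawIterable()).add("sneaky"); // succeeds
        System.out.println(holder.getData().size());            // prints "2"
    }
}
```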


General considerations

In any case, you should document the behavior religiously. Particularly, you should document that the caller is not supposed to change the returned data in any way (regardless of whether it is possible without causing an exception).

Beyond that, there is an uncomfortable trade-off: You can either be precise in the documentation, or avoid exposing implementation details in the documentation.

Consider the following case:

/**
 * Returns the data. The returned list is unmodifiable. 
 */
List<String> getData() {
    return Collections.unmodifiableList(internalData);
}

The documentation here should in fact also state that...

/* ...
 * The returned list is a VIEW on the internal data. 
 * Changes in the internal data will be visible in 
 * the returned list.
 */

This may be important information, considering thread safety and the behavior during iteration. Consider a loop that iterates over the unmodifiable view on the internal data. And consider that in this loop, someone calls a function that causes a modification of the internal data:

for (String s : data.getData()) {
    ...
    data.changeInternalData();
}

This loop will typically fail with a ConcurrentModificationException on the next iteration step, because the internal data (assuming a fail-fast collection such as ArrayList) is modified while it is being iterated over.
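A self-contained version of this scenario, assuming the internal data is an ArrayList (ViewIterationDemo and failsFast are made-up names; changeInternalData is the placeholder from the loop above, given a trivial body here):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo of iterating a view while the backing list changes.
class ViewIterationDemo {
    private final List<String> internalData = new ArrayList<>(Arrays.asList("a", "b", "c"));

    List<String> getData() {
        return Collections.unmodifiableList(internalData);
    }

    // Assumed body: any structural change to the backing list will do.
    void changeInternalData() {
        internalData.add("x");
    }

    // Returns true if the iteration was aborted by the fail-fast check.
    static boolean failsFast() {
        ViewIterationDemo data = new ViewIterationDemo();
        try {
            for (String s : data.getData()) {
                data.changeInternalData();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFast()); // prints "true"
    }
}
```

The fail-fast check is documented as best-effort, so code should not rely on the exception for correctness.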

The trade-off regarding the documentation here refers to the fact that, once a certain behavior is specified, clients will rely on this behavior. Imagine the client does this:

List<String> list = data.getData();
int oldSize = list.size();
data.insertElementToInternalData();

// Here, the client relies on the fact that he received
// a VIEW on the internal data:
int newSize = list.size();
assertTrue(newSize == oldSize+1);

Things like the ConcurrentModificationException could have been avoided if a true copy of the internal data had been returned, or by using a CopyOnWriteArrayList (each wrapped into a Collections.unmodifiableList). This would be the "safest" solution, in this regard:

  • The caller can not modify the returned list
  • The caller can not modify the internal state directly
  • If the caller modifies the internal state indirectly, then the iteration still works

But one has to think about whether so much "safety" is really required for the respective application case, and how this can be documented in a way that still allows changes to the internal implementation details.
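For comparison, here is a sketch of the same kind of loop over a wrapped CopyOnWriteArrayList. The names SnapshotIterationDemo and iterateWhileModifying are assumptions for this example. The iterator works on the snapshot taken when the loop started, so the indirect modifications do not disturb it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo of snapshot iteration.
class SnapshotIterationDemo {
    private final List<String> internalData =
            new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));

    List<String> getData() {
        return Collections.unmodifiableList(internalData);
    }

    // Assumed body: any modification of the backing list will do.
    void changeInternalData() {
        internalData.add("x");
    }

    // Iterates over the view while modifying the backing list in each step
    // and returns the elements that the loop actually saw.
    static List<String> iterateWhileModifying() {
        SnapshotIterationDemo data = new SnapshotIterationDemo();
        List<String> seen = new ArrayList<>();
        for (String s : data.getData()) {
            seen.add(s);
            data.changeInternalData();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(iterateWhileModifying()); // prints "[a, b, c]"
    }
}
```

The price is that every write copies the underlying array, so this fits best when reads greatly outnumber writes.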

answered Oct 20 '22 08:10 by Marco13


Typically, Iterator is used only together with Iterable, for the purpose of the for-each loop. It would be pretty odd to see a non-Iterable type contain a method returning an Iterator, and it may be upsetting to the user that it cannot be used in a for-each loop.

So I suggest Iterable in this case. You could even have your class implement Iterable if that makes sense.
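A sketch of what that might look like for the class from the question (the addData method is a hypothetical stand-in for whatever mutators the real class has):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Sketch only: a reduced variant of the question's DataProcessor.
class DataProcessor implements Iterable<String> {
    private final List<String> simpleData = new ArrayList<>();

    // Hypothetical mutator, for illustration.
    void addData(String s) {
        simpleData.add(s);
    }

    // The iterator of the unmodifiable wrapper throws
    // UnsupportedOperationException on remove().
    @Override
    public Iterator<String> iterator() {
        return Collections.unmodifiableList(simpleData).iterator();
    }

    public static void main(String[] args) {
        DataProcessor processor = new DataProcessor();
        processor.addData("a");
        processor.addData("b");
        for (String s : processor) {
            System.out.println(s);
        }
    }
}
```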

If you want to jump on the Java 8 bandwagon, returning a Stream is probably a more "modern" approach.
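A minimal sketch of the Stream variant, with StreamingProcessor, addData and stream as assumed names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class exposing its data only as a Stream.
class StreamingProcessor {
    private final List<String> simpleData = new ArrayList<>();

    // Hypothetical mutator, for illustration.
    void addData(String s) {
        simpleData.add(s);
    }

    // The Stream API has no operation that modifies the source list.
    Stream<String> stream() {
        return simpleData.stream();
    }

    public static void main(String[] args) {
        StreamingProcessor processor = new StreamingProcessor();
        processor.addData("apple");
        processor.addData("banana");
        processor.addData("avocado");
        List<String> result = processor.stream()
                .filter(s -> s.startsWith("a"))
                .collect(Collectors.toList());
        System.out.println(result); // prints "[apple, avocado]"
    }
}
```

A stream can be consumed only once, so callers have to call stream() again for each traversal, and the source should not be modified while a terminal operation is running.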

answered Oct 20 '22 09:10 by ZhongYu