An Iterator which mutates and returns the same object. Bad practice?

I'm writing GC-friendly code that reads a series of byte[] messages and returns them to the user. Internally I reuse the same ByteBuffer, which means I'll repeatedly return the same byte[] instance most of the time.

I'm considering writing cautionary javadoc and exposing this to the user as an Iterator<byte[]>. AFAIK it won't violate the Iterator contract, but the user certainly could be surprised if they do Lists.newArrayList(myIterator) and get back a List populated with the same byte[] in each position!
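To make the surprise concrete, here is a minimal sketch. ReusingIterator and ReuseSurprise are made-up names, and a plain ArrayList loop stands in for Guava's Lists.newArrayList, which I'd expect to behave the same way since it also just stores whatever next() returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical iterator that refills one shared byte[] on every next() call.
class ReusingIterator implements Iterator<byte[]> {
    private final byte[] shared = new byte[2]; // the single reused buffer
    private final int count;                   // how many messages to produce
    private int produced = 0;

    ReusingIterator(int count) { this.count = count; }

    @Override public boolean hasNext() { return produced < count; }

    @Override public byte[] next() {
        if (!hasNext()) throw new NoSuchElementException();
        produced++;
        Arrays.fill(shared, (byte) produced); // overwrite the previous message
        return shared;                        // same instance every time
    }
}

class ReuseSurprise {
    // Stores every element the iterator returns in a list, as a caller might.
    static List<byte[]> drain(Iterator<byte[]> it) {
        List<byte[]> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> messages = drain(new ReusingIterator(3));
        // Every slot references the same array, which holds only the last message.
        System.out.println(messages.get(0) == messages.get(2));
        System.out.println(Arrays.toString(messages.get(0)));
    }
}
```

The list has three entries, but they are all the same array, so the first two messages are gone by the time the caller looks at them.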

The question: is it bad practice for a class that may mutate and return the same object to implement the Iterator interface?

- If so, what is the best alternative? "Don't mutate/reuse your objects" is an easy answer. But it doesn't address the cases when reuse is very desirable.
- If not, how do you justify violating the principle of least astonishment?

Two minor notes:

- I'm using Guava's AbstractIterator so remove() isn't really of concern.
- In my use case the user is me and the visibility of this class will be limited, but I've tried to ask this generally enough to apply more broadly.

Update: I'm accepting Louis' answer because it has 3x more votes than Keith's, but note that in my use case I'm planning to take the code that I left in a comment on Keith's answer to production.

EnumMap did essentially exactly this in its entrySet() iterator, which causes confusing, crazy, depressing bugs to this day.

If I were you, I just wouldn't use an Iterator -- I'd write a different API (possibly quite dissimilar from Iterator, even) and implement that. For example, you might write a new API that takes as input the ByteBuffer to write the message into, so users of the API could control whether or not the buffer gets reused. That seems reasonably intuitive (the user can write code that obviously and cleanly reuses the ByteBuffer), without creating unnecessarily cluttered code.
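One possible shape for such an API, as a rough sketch only: MessageSource, readInto and ArrayMessageSource are names I made up for illustration, and the in-memory source exists just so the example runs. The point is that the caller owns the ByteBuffer, so reusing it is a visible decision in the caller's code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical API: the caller supplies the buffer, so reuse is the caller's explicit choice.
interface MessageSource {
    // Clears dst, writes the next message into it and flips it for reading.
    // Returns false when there are no more messages.
    boolean readInto(ByteBuffer dst);
}

// Toy in-memory source, used only to make the sketch runnable.
class ArrayMessageSource implements MessageSource {
    private final byte[][] messages;
    private int next = 0;

    ArrayMessageSource(byte[]... messages) { this.messages = messages; }

    @Override public boolean readInto(ByteBuffer dst) {
        if (next >= messages.length) return false;
        dst.clear();
        dst.put(messages[next++]); // throws BufferOverflowException if dst is too small
        dst.flip();
        return true;
    }
}

class CallerControlledReuse {
    // Reads all messages through one caller-owned buffer and returns the total bytes seen.
    static int totalBytes(MessageSource source, ByteBuffer buffer) {
        int total = 0;
        while (source.readInto(buffer)) total += buffer.remaining();
        return total;
    }

    public static void main(String[] args) {
        MessageSource source = new ArrayMessageSource(
            "ab".getBytes(StandardCharsets.US_ASCII),
            "cde".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer buffer = ByteBuffer.allocate(16); // one allocation, reused for every message
        System.out.println(totalBytes(source, buffer));
    }
}
```

A caller who wants to keep every message simply passes a fresh ByteBuffer on each call; nothing about the API hides the reuse.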

I would define an intermediate object which you can invalidate. So your function would return an Iterator<ByteArray>, and ByteArray is something like this:

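class ByteArray {
    private byte[] data; // set to null once the iterator has moved on

    ByteArray(byte[] d) { data = d; }

    // Fails fast if the iterator has already invalidated this wrapper.
    byte[] getData() {
        // BadUseOfIteratorException stands for an unchecked exception you define yourself
        if (data == null) throw new BadUseOfIteratorException();
        return data;
    }

    // Called by the iterator before it reuses the underlying array.
    void invalidate() { data = null; }
}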

Then your iterator can invalidate the previously returned ByteArray so that any future access (via getData, or any other accessor you provide) will fail. That way, if someone does something like Lists.newArrayList(myIterator), they will at least get an error (when the first invalid ByteArray is accessed) instead of silently getting the wrong data.
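Here is a sketch of how the iterator side could look. InvalidatingIterator and InvalidationDemo are illustrative names, the message contents are faked with Arrays.fill, and BadUseOfIteratorException is assumed to be an unchecked exception since it isn't defined anywhere in this answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Assumed to be an unchecked exception; the snippet in the answer leaves it undefined.
class BadUseOfIteratorException extends IllegalStateException {
    BadUseOfIteratorException() { super("ByteArray used after the iterator advanced"); }
}

class ByteArray {
    private byte[] data;
    ByteArray(byte[] d) { data = d; }
    byte[] getData() {
        if (data == null) throw new BadUseOfIteratorException();
        return data;
    }
    void invalidate() { data = null; }
}

// Hypothetical iterator: one shared byte[], a fresh ByteArray wrapper per element.
class InvalidatingIterator implements Iterator<ByteArray> {
    private final byte[] shared = new byte[2];
    private final int count;
    private int produced = 0;
    private ByteArray previous; // wrapper handed out by the last next() call

    InvalidatingIterator(int count) { this.count = count; }

    @Override public boolean hasNext() { return produced < count; }

    @Override public ByteArray next() {
        if (!hasNext()) throw new NoSuchElementException();
        if (previous != null) previous.invalidate(); // the old wrapper now fails on access
        produced++;
        Arrays.fill(shared, (byte) produced);        // stand-in for reading a real message
        previous = new ByteArray(shared);
        return previous;
    }
}

class InvalidationDemo {
    // Collects all wrappers first, then reports whether reading a stale one throws.
    static boolean staleAccessFails() {
        Iterator<ByteArray> it = new InvalidatingIterator(3);
        List<ByteArray> collected = new ArrayList<>();
        while (it.hasNext()) collected.add(it.next());
        try {
            collected.get(0).getData();
            return false;
        } catch (BadUseOfIteratorException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(staleAccessFails());
    }
}
```

Only the most recently returned wrapper stays readable, so collecting everything into a list fails loudly on the first stale element.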

Of course, this won't catch every bad use, only the common ones: a caller who holds on to the raw byte[] obtained from getData() will still see it overwritten. If you're happy with never returning the raw byte[] and providing accessors like byte get(int idx) instead, then it should catch all cases.
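A sketch of that accessor-only variant follows. The length() method and the private live() helper are additions of mine for illustration, and BadUseOfIteratorException is again assumed to be an unchecked exception.

```java
// Assumed to be an unchecked exception; it is not defined in the original snippet.
class BadUseOfIteratorException extends IllegalStateException {}

// Variant of ByteArray that never hands out the raw array.
class ByteArray {
    private byte[] data;

    ByteArray(byte[] d) { data = d; }

    // Single choke point: every accessor goes through this validity check.
    private byte[] live() {
        if (data == null) throw new BadUseOfIteratorException();
        return data;
    }

    int length() { return live().length; }      // hypothetical extra accessor
    byte get(int idx) { return live()[idx]; }
    void invalidate() { data = null; }
}

class AccessorDemo {
    // Reads through the accessors, invalidates, then checks that a further read throws.
    static boolean readThenInvalidate() {
        byte[] shared = {7, 8};
        ByteArray view = new ByteArray(shared);
        boolean readOk = view.length() == 2 && view.get(1) == 8;
        view.invalidate();
        try {
            view.get(0);
            return false;
        } catch (BadUseOfIteratorException e) {
            return readOk;
        }
    }

    public static void main(String[] args) {
        System.out.println(readThenInvalidate());
    }
}
```

Because the array reference never escapes, there is nothing for a caller to hold on to once the wrapper has been invalidated.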

You will have to allocate a new ByteArray for each iterator return, but hopefully that's a lot less expensive than copying your byte[] for each iterator return.

Tags: java, iterator, guava

Brian Harris

2 Answers

Louis Wasserman

Keith Randall
