I'm writing GC-friendly code to read and return to the user a series of byte[] messages. Internally I reuse the same ByteBuffer, which means I'll repeatedly return the same byte[] instance most of the time.
I'm considering writing cautionary Javadoc and exposing this to the user as an Iterator<byte[]>. AFAIK it won't violate the Iterator contract, but the user could certainly be surprised if they call Lists.newArrayList(myIterator) and get back a List populated with the same byte[] in each position!
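To make the surprise concrete, here is a minimal sketch using only the JDK. ReusingIterator, ReusedBufferDemo, and collect are hypothetical names, not the asker's code; collect simply drains the iterator into a list the way Lists.newArrayList would.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical iterator that overwrites and returns one shared byte[] on every call.
class ReusingIterator implements Iterator<byte[]> {
    private final byte[] shared = new byte[1]; // the single reused buffer
    private final int count;
    private int next = 0;

    ReusingIterator(int count) { this.count = count; }

    @Override public boolean hasNext() { return next < count; }

    @Override public byte[] next() {
        if (!hasNext()) throw new NoSuchElementException();
        shared[0] = (byte) next++; // overwrite the previous "message" in place
        return shared;
    }
}

class ReusedBufferDemo {
    // Drains the iterator into a list, storing each returned reference as-is.
    static List<byte[]> collect(Iterator<byte[]> it) {
        List<byte[]> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> messages = collect(new ReusingIterator(3));
        // Every slot holds the same array, which now contains only the last message.
        System.out.println(messages.get(0) == messages.get(2)); // true
        System.out.println(messages.get(0)[0]);                 // 2
    }
}
```

The list has the right size, but every element is the same reference, so the earlier messages are lost.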
The question: is it bad practice for a class that may mutate and return the same object to implement the Iterator interface?
If so, what is the best alternative? "Don't mutate/reuse your objects" is an easy answer. But it doesn't address the cases when reuse is very desirable.
If not, how do you justify violating the principle of least astonishment?
Two minor notes:
I'm using Guava's AbstractIterator, so remove() isn't really of concern.
In my use case the user is me and the visibility of this class will be limited, but I've tried to ask this generally enough to apply more broadly.
Update: I'm accepting Louis' answer because it has 3x more votes than Keith's, but note that in my use case I'm planning to take the code that I left in a comment on Keith's answer to production.
EnumMap did essentially exactly this in its entrySet() iterator, which causes confusing, crazy, depressing bugs to this day.
If I were you, I just wouldn't use an Iterator. I'd write a different API (possibly quite dissimilar from Iterator, even) and implement that. For example, you might write a new API that takes as input the ByteBuffer to write the message into, so users of the API can control whether or not the buffer gets reused. That seems reasonably intuitive (the user can write code that obviously and cleanly reuses the ByteBuffer) and doesn't lead to unnecessarily cluttered code.
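One possible shape for such an API is sketched below. MessageSource, readNext, InMemoryMessageSource, and CallerOwnedBufferDemo are all hypothetical names invented for illustration, and the in-memory queue merely stands in for whatever actually produces the messages.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical interface: the caller owns the buffer, so reuse is the caller's explicit choice.
interface MessageSource {
    // Clears dst, writes the next message into it, and flips it for reading.
    // Returns false (leaving dst untouched) when there are no more messages.
    boolean readNext(ByteBuffer dst);
}

// Toy in-memory source standing in for the real reader.
class InMemoryMessageSource implements MessageSource {
    private final Queue<byte[]> pending = new ArrayDeque<>();

    InMemoryMessageSource(List<String> messages) {
        for (String m : messages) pending.add(m.getBytes(StandardCharsets.UTF_8));
    }

    @Override public boolean readNext(ByteBuffer dst) {
        byte[] msg = pending.poll();
        if (msg == null) return false;
        dst.clear();
        dst.put(msg); // throws BufferOverflowException if dst is too small
        dst.flip();
        return true;
    }
}

class CallerOwnedBufferDemo {
    // Decodes the readable part of the buffer as UTF-8 text.
    static String decode(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf).toString();
    }

    public static void main(String[] args) {
        MessageSource source = new InMemoryMessageSource(Arrays.asList("hello", "world"));
        ByteBuffer buf = ByteBuffer.allocate(64); // one allocation, visibly reused by the caller
        while (source.readNext(buf)) {
            System.out.println(decode(buf));
        }
    }
}
```

A caller who wants to keep every message can pass a fresh ByteBuffer on each call; a caller who wants to avoid garbage reuses one. Either way the aliasing is visible at the call site.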
I would define an intermediate object which you can invalidate. So your function would return an Iterator<ByteArray>, and ByteArray is something like this:
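// Wrapper handed out by the iterator; it can be invalidated once the buffer is reused.
class ByteArray {
    private byte[] data;

    ByteArray(byte[] d) { data = d; }

    // BadUseOfIteratorException is not a JDK class; define it as an unchecked exception.
    byte[] getData() {
        if (data == null) throw new BadUseOfIteratorException();
        return data;
    }

    // Called by the iterator before it overwrites the shared byte[].
    void invalidate() { data = null; }
}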
Then your iterator can invalidate the previously returned ByteArray so that any future access (via getData, or any other accessor you provide) will fail. Then if someone does something like Lists.newArrayList(myIterator), they will at least get an error (when the first invalid ByteArray is accessed) instead of silently getting the wrong data.
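The iterator side of this might look roughly like the following sketch. InvalidatingIterator is a hypothetical name, the one-byte "messages" are placeholders, and IllegalStateException stands in for BadUseOfIteratorException so that the example compiles on its own.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Same shape as the ByteArray wrapper described above; IllegalStateException is
// substituted for BadUseOfIteratorException (an assumption made for self-containment).
class ByteArray {
    private byte[] data;

    ByteArray(byte[] d) { data = d; }

    byte[] getData() {
        if (data == null) throw new IllegalStateException("stale ByteArray: the iterator has advanced");
        return data;
    }

    void invalidate() { data = null; }
}

// Hypothetical iterator: reuses one byte[] and invalidates the wrapper it handed out last.
class InvalidatingIterator implements Iterator<ByteArray> {
    private final byte[] shared = new byte[1]; // the single reused buffer
    private final int count;
    private int next = 0;
    private ByteArray previous; // wrapper returned by the most recent next() call

    InvalidatingIterator(int count) { this.count = count; }

    @Override public boolean hasNext() { return next < count; }

    @Override public ByteArray next() {
        if (!hasNext()) throw new NoSuchElementException();
        if (previous != null) previous.invalidate(); // old wrapper must no longer expose the buffer
        shared[0] = (byte) next++;                   // overwrite the shared buffer in place
        previous = new ByteArray(shared);            // small allocation; the byte[] itself is reused
        return previous;
    }
}
```

With this in place, collecting the iterator into a list still succeeds, but calling getData() on any element except the last throws instead of returning overwritten bytes.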
Of course, this won't catch all possible bad uses (a caller can still hold on to the raw byte[] returned by an earlier getData call), but it will probably catch the common ones. If you're happy with never returning the raw byte[] and providing accessors like byte get(int idx) instead, then it should catch all cases.
You will have to allocate a new ByteArray for each iterator return, but hopefully that's a lot less expensive than copying your byte[] for each iterator return.