Why does JavaScript `iterator.next()` return an object?

Help! I'm learning to love JavaScript after programming in C# for quite a while, but I'm stuck learning to love the iterable protocol!

Why did JavaScript adopt a protocol that requires creating a new object for each iteration? Why have next() return a new object with the properties done and value, instead of adopting a protocol like C#'s IEnumerable and IEnumerator, which allocates no object per step at the expense of requiring two calls (one to MoveNext() to see whether the iteration is done, and a second to Current to get the value)?

Are there under-the-hood optimizations that skip the allocation of the object returned by next()? Hard to imagine, given that the iterator doesn't know how the object will be used once returned...

Generators don't seem to reuse the object returned by next(), as illustrated below:

function* generator() {
  yield 0;
  yield 1;
}

var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();

console.log(result0.value) // 0
console.log(result1.value) // 1

Hm, here's a clue (thanks to Bergi!):

We will answer one important question later (in Sect. 3.2): Why can iterators (optionally) return a value after the last element? That capability is the reason for elements being wrapped. Otherwise, iterators could simply return a publicly defined sentinel (stop value) after the last element.

And in Sect. 3.2 they discuss using generators as lightweight threads. It seems to say that the reason for returning an object from next() is so that a value can be returned even when done is true! Whoa. Furthermore, generators can return values in addition to yield-ing and yield*-ing values, and a value produced by return ends up in value when done is true!
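A minimal sketch of that behaviour, assuming nothing beyond standard generators (the names inner and outer are made up for illustration): the value given to return travels in the final { value, done: true } result, and yield* evaluates to it.

```javascript
// Inner generator: yields two values, then *returns* a final result.
function* inner() {
  yield 'a';
  yield 'b';
  return 'inner result'; // delivered as { value: 'inner result', done: true }
}

// Outer generator: yield* forwards inner's yields and evaluates to its return value.
function* outer() {
  const result = yield* inner();
  yield `got: ${result}`;
}

const seenValues = [...outer()];
console.log(seenValues); // [ 'a', 'b', 'got: inner result' ]
```

Note that the spread only sees yielded values; the return value of inner surfaces solely because outer re-yields it.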

And all this allows for pseudo-threading. And that feature, pseudo-threading, is worth allocating a new object for each time around the loop... Javascript. Always so unexpected!


Although, now that I think about it, allowing yield* to "return" a value to enable pseudo-threading still doesn't justify returning an object. The IEnumerator protocol could be extended to expose a value after moveNext() returns false: just add a property hasCurrent to test after the iteration is complete; when it is true, current holds a valid value...
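To make the idea concrete, here is a rough sketch of such a cursor written as an adapter over a standard iterator. Everything here (toEnumerator, moveNext, current, hasCurrent) is hypothetical and not part of any JavaScript API, and the adapter still allocates result objects internally, so it only illustrates the shape of the protocol, not its cost. As a simplification, a final value of undefined is treated as "no value".

```javascript
// Hypothetical adapter: wraps a standard JS iterator in a C#-style cursor.
function toEnumerator(iterator) {
  let last = { value: undefined, done: false };
  return {
    // Advances the cursor; returns false once the iteration is complete.
    moveNext() {
      if (last.done) return false;
      last = iterator.next();
      return !last.done;
    },
    // The most recent value: a yielded element, or the return value at the end.
    get current() {
      return last.value;
    },
    // True only after completion, and only if a final value other than undefined was supplied.
    get hasCurrent() {
      return last.done && last.value !== undefined;
    },
  };
}

function* numbers() {
  yield 0;
  return 1;
}

const cursor = toEnumerator(numbers());
const elements = [];
while (cursor.moveNext()) elements.push(cursor.current);

console.log(elements);          // [ 0 ]
console.log(cursor.hasCurrent); // true
console.log(cursor.current);    // 1
```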

And the compiler optimizations are non-trivial. This will result in quite wild variance in the performance of an iterator... doesn't that cause problems for library implementors?

All these points are raised in this thread discovered by the friendly SO community. Yet those arguments didn't seem to carry the day.


However, regardless of returning an object or not, no one is going to be checking for a value after iteration is "complete", right? E.g. most everyone would think the following would log all values returned by an iterator:

function logIteratorValues(iterator) {
  var next;
  while(next = iterator.next(), !next.done)
    console.log(next.value)
}

Except it doesn't, because even though done is true the iterator might still have returned another value. Consider:

function* generator() {
  yield 0;
  return 1;
}

var iterator = generator();
var result0 = iterator.next();
var result1 = iterator.next();

console.log(`${result0.value}, ${result0.done}`) // 0, false
console.log(`${result1.value}, ${result1.done}`) // 1, true
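The same generator can be used to sketch how the built-in for...of loop treats that final value, compared with a manual loop that keeps it. The helper drain and the generator name zeroThenReturnOne are made up for this illustration.

```javascript
function* zeroThenReturnOne() {
  yield 0;
  return 1;
}

// for...of stops as soon as done is true and discards that final result's value.
const viaForOf = [];
for (const v of zeroThenReturnOne()) viaForOf.push(v);
console.log(viaForOf); // [ 0 ]

// Hypothetical helper: drains an iterator manually and keeps the return value too.
function drain(iterator) {
  const yielded = [];
  let step = iterator.next();
  while (!step.done) {
    yielded.push(step.value);
    step = iterator.next();
  }
  return { yielded, returned: step.value };
}

const drained = drain(zeroThenReturnOne());
console.log(drained); // { yielded: [ 0 ], returned: 1 }
```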

Is an iterator that returns a value after it is "done" really an iterator? What is the sound of one hand clapping? It just seems quite odd...


And here is an in-depth post on generators that I enjoyed. Much of it is spent on controlling the flow of an application, as opposed to iterating over the members of a collection.


Another possible explanation is that IEnumerable/IEnumerator requires two interfaces and three methods and the JS community preferred the simplicity of a single method. That way they wouldn't have to introduce the notion of groups of symbolic methods aka interfaces...

Christopher King asked Dec 22 '18 09:12


2 Answers

Are there under-the-hood optimizations that skip the allocation of the object returned by next()?

Yes. Those iterator result objects are small and usually short-lived. Particularly in for … of loops, the compiler can do a trivial escape analysis to see that the object doesn't face the user code at all (but only the internal loop evaluation code). They can be dealt with very efficiently by the garbage collector, or even be allocated directly on the stack.
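To see why the result object never reaches user code in a for … of loop, it helps to write the loop out by hand. This is only an approximation of what the engine does (it ignores the iterator.return() cleanup on early exit), with made-up variable names:

```javascript
const iterable = ['x', 'y'];

// What the user writes:
const viaLoop = [];
for (const item of iterable) viaLoop.push(item);

// Roughly what the loop does internally:
const manual = [];
const it = iterable[Symbol.iterator]();
for (let r = it.next(); !r.done; r = it.next()) {
  const item = r.value; // only the value reaches the loop body, never r itself
  manual.push(item);
}

console.log(viaLoop); // [ 'x', 'y' ]
console.log(manual);  // [ 'x', 'y' ]
```

Because r is touched only by the loop machinery, a compiler can prove it does not escape and avoid materialising it.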

Here are some sources:

  • JS inherits its functionally-minded iteration protocol from Python, but with result objects instead of the previously favoured StopIteration exceptions
  • Performance concerns in the spec discussion (cont'd) were shrugged off. If you implement a custom iterator and it is too slow, try using a generator function
  • (At least for builtin iterators) these optimisations are already implemented:

    The key to great performance for iteration is to make sure that the repeated calls to iterator.next() in the loop are optimized well, and ideally completely avoid the allocation of the iterResult using advanced compiler techniques like store-load propagation, escape analysis and scalar replacement of aggregates. To really shine performance-wise, the optimizing compiler should also completely eliminate the allocation of the iterator itself - the iterable[Symbol.iterator]() call - and operate on the backing-store of the iterable directly.
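As a sketch of the suggestion in the second point, here is a hand-written iterable next to an equivalent generator function. The names rangeIterable and rangeGenerator are invented for the example, and no claim is made here about which is faster in any particular engine; the point is only that the two are interchangeable for consumers.

```javascript
// Hand-written iterable: next() builds a fresh result object on every call.
function rangeIterable(start, end) {
  return {
    [Symbol.iterator]() {
      let i = start;
      return {
        next() {
          return i < end
            ? { value: i++, done: false }
            : { value: undefined, done: true };
        },
      };
    },
  };
}

// Equivalent generator function: the engine creates the result objects itself.
function* rangeGenerator(start, end) {
  for (let i = start; i < end; i++) yield i;
}

const fromIterable = [...rangeIterable(0, 3)];
const fromGenerator = [...rangeGenerator(0, 3)];
console.log(fromIterable);  // [ 0, 1, 2 ]
console.log(fromGenerator); // [ 0, 1, 2 ]
```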

Bergi answered Oct 02 '22 10:10


Bergi has already answered, and I've upvoted; I just want to add this:

Why should you even be concerned about a new object being returned? It looks like:

{done: boolean, value: any}

You know, you are going to use the value anyway, so it's really not extra memory overhead. What's left? done: boolean and the object itself cost only a few machine words, which the CPU can allocate and process in nanoseconds or less (I think less, given the likely-existing V8 optimizations). Now, if you still care about wasting that amount of time and memory, then you really should consider switching from JS to something like Rust+WebAssembly.

Nurbol Alpysbayev answered Oct 02 '22 10:10