Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a macro for creating fast Iterators from generator-like functions in julia?

Coming from python3 to Julia one would love to be able to write fast iterators as a function with produce/yield syntax or something like that.

Julia's macros seem to suggest that one could build a macro which transforms such a "generator" function into an julia iterator. [It even seems like you could easily inline iterators written in function style, which is a feature the Iterators.jl package also tries to provide for its specific iterators https://github.com/JuliaCollections/Iterators.jl#the-itr-macro-for-automatic-inlining-in-for-loops ]

Just to give an example of what I have in mind:

@asiterator function myiterator(as::Array)
  b = 1
  for (a1, a2) in zip(as, as[2:end])
    try
      @produce a1[1] + a2[2] + b
    catch exc
    end
  end
end

for i in myiterator([(1,2), (3,1), 3, 4, (1,1)])
   @show i
end

where myiterator should ideally create a fast iterator with as low overhead as possible. And of course this is only one specific example. I ideally would like to have something which works with all or almost all generator functions.

The currently recommended way to transform a generator function into an iterator is via Julia's Tasks, at least to my knowledge. However they also seem to be way slower then pure iterators. For instance if you can express your function with the simple iterators like imap, chain and so on (provided by Iterators.jl package) this seems to be highly preferable.

Is it theoretically possible in julia to build a macro converting generator-style functions into flexible fast iterators?

Extra-Point-Question: If this is possible, could there be a generic macro which inlines such iterators?

like image 538
schlichtanders Avatar asked Apr 23 '17 19:04

schlichtanders


3 Answers

After thinking a lot how to translate python generators to Julia without loosing much performance, I implemented and tested a library of higher level functions which implement Python-like/Task-like generators in a continuation-style. https://github.com/schlichtanders/Continuables.jl

Essentially, the idea is to regard Python's yield / Julia's produce as a function which we take from the outside as an extra parameter. I called it cont for continuation. Look for instance on this reimplementation of a range

crange(n::Integer) = cont -> begin
  for i in 1:n
    cont(i)
  end
end

You can simply sum up all integers by the following code

function sum_continuable(continuable)
  a = Ref(0)
  continuable() do i
    a.x += i
  end
  a.x
end

# which simplifies with the macro Continuables.@Ref to
@Ref function sum_continuable(continuable)
  a = Ref(0)
  continuable() do i
    a += i
  end
  a
end

sum_continuable(crange(4))  # 10

As you hopefully agree, you can work with continuables almost like you would have worked with generators in python or tasks in julia. Using do notation instead of for loops is kind of the one thing you have to get used to.

This idea takes you really really far. The only standard method which is not purely implementable using this idea is zip. All the other standard higher-level tools work just like you would hope.

The performance is unbelievably faster than Tasks and even faster than Iterators in some cases (notably the naive implementation of Continuables.cmap is orders of magnitude faster than Iterators.imap). Check out the Readme.md of the github repository https://github.com/schlichtanders/Continuables.jl for more details.


EDIT: To answer my own question more directly, there is no need for a macro @asiterator, just use continuation style directly.

mycontinuable(as::Array) = cont -> begin
  b = 1
  for (a1, a2) in zip(as, as[2:end])
    try
      cont(a1[1] + a2[2] + b)
    catch exc
    end
  end
end

mycontinuable([(1,2), (3,1), 3, 4, (1,1)]) do i
   @show i
end
like image 77
schlichtanders Avatar answered Nov 18 '22 23:11

schlichtanders


Some iterators of this form can be written like this:

myiterator(as) = (a1[1] + a2[2] + 1 for (a1, a2) in zip(as, as[2:end]))

This code can (potentially) be inlined.

To fully generalize this, it is in theory possible to write a macro that converts its argument to continuation-passing style (CPS), making it possible to suspend and restart execution, giving something like an iterator. Delimited continuations are especially appropriate for this (https://en.wikipedia.org/wiki/Delimited_continuation). The result is a big nest of anonymous functions, which might be faster than Task switching, but not necessarily, since at the end of the day it needs to heap-allocate a similar amount of state.

I happen to have an example of such a transformation here (in femtolisp though, not Julia): https://github.com/JeffBezanson/femtolisp/blob/master/examples/cps.lsp This ends with a define-generator macro that does what you describe. But I'm not sure it's worth the effort to do this for Julia.

like image 42
Jeff Bezanson Avatar answered Nov 18 '22 22:11

Jeff Bezanson


Python-style generators – which in Julia would be closest to yielding from tasks – involve a fair amount of inherent overhead. You have to switch tasks, which is non-trivial and cannot straightforwardly be eliminated by a compiler. That's why Julia's iterators are based on functions that transform one typically immutable, simple state value, and another. Long story short: no, I do not believe that this transformation can be done automatically.

like image 2
StefanKarpinski Avatar answered Nov 18 '22 23:11

StefanKarpinski