Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python 3.3's yield from

Python 3 brings the yield from semantics. As far as I understand it's supposed to yield to the outermost generator in which case I'd expect this code to be linear in N.

from collections import Iterable

def flatten(L):
  for e in L:
    if isinstance(e, Iterable):
      yield from flatten(e)
    else:
      yield e 

N = 100
L = [-1]
for i in range(N):
  L = [i, [L], i]
for i in range(100):
  f = list(flatten(L))
print(len(f))

If I set N=200 however the computation time is about four times longer suggesting flatten is quadratic in the length of L. I can't understand why it would be since the code visits each element only once and the yield from keyword should be sending values directly from the inner generator to where the terms are collected into the list.

Is this a bug, simply not intended, or am I using it wrong? Is there a nice way to do O(N) flattening in Python?

like image 737
LordLing Avatar asked Sep 08 '12 13:09

LordLing


People also ask

What is yield from in Python?

What Is Yield In Python? The Yield keyword in Python is similar to a return statement used for returning values or objects in Python. However, there is a slight difference. The yield statement returns a generator object to the one who calls the function which contains yield, instead of simply returning a value.

What is the difference between yield and yield from?

yield from yields each item of the iterable, but yield yields the iterable itself.

How do you yield a generator in Python?

You can assign this generator to a variable in order to use it. When you call special methods on the generator, such as next() , the code within the function is executed up to yield . When the Python yield statement is hit, the program suspends function execution and returns the yielded value to the caller.

What is the difference between yield and return Python?

The yield statement hauls the function and returns back the value to the function caller and restart from where it is left off. The yield statement can be called multiple times. While the return statement ends the execution of the function and returns the value back to the caller.


1 Answers

yield from, just like for item in x: yield x, is linear. However, function calls are slow, and because of the nesting in your l, when you double N, you're not merely doubling the number of terms, you're doubling the number of calls needed. Anything which scales with the number of calls, like function overhead itself esp. due to the recursion, any yield from overhead, for loop initialization overhead, whatever, would therefore cause a problem. If this is right, then we'd expect that a list with the same number of elements and no nesting would be linear, mid-nesting would be in-between, and lots of nesting would be super-slow. And that's what we see:

import timeit

s = """

from collections import Iterable

def flatten(l):
   for e in l:
       if isinstance(e, Iterable):
           yield from flatten(e)
       else:
           yield e 

def build_lots(N):
    l = [-1]
    for i in range(N):
        l = [i, [l], i]
    return l

def build_some(N):
    l = [-1]
    for i in range(N):
        l = [i]+l+[i] if i % 2 else [i,[l],i]
    return l

def build_none(N):
    return range(2*N+1)

"""

def get_time(size, which_build, n=100):
    setup = s + "l = build_{}({});".format(which_build, size)
    ans = timeit.Timer(setup=setup, stmt="z=list(flatten(l))").timeit(n)
    return ans

print('length', 'none','some','lots')
for w in range(0, 500, 50):
    print(2*w+1, 
          get_time(w, 'none'), 
          get_time(w, 'some'),
          get_time(w, 'lots'))

produces

length none some lots
1 0.0012789969332516193 0.0006600483320653439 0.000653265044093132
101 0.030214487109333277 0.06863495009019971 0.10554128512740135
201 0.05980244372040033 0.188231083098799 0.3237948380410671
301 0.08960435865446925 0.3752179825678468 0.6493003228679299
401 0.11986956000328064 0.6066137161105871 1.147628225851804
501 0.14737469609826803 0.9323012446984649 1.7087256000377238
601 0.18555369088426232 1.2575508910231292 2.2957410947419703
701 0.20820995513349771 1.712264522910118 3.094764341134578
801 0.23618148919194937 2.100640726275742 4.079551971051842
901 0.26863432209938765 2.617169467266649 4.858607416041195

simple plot of time vs nesting

like image 189
DSM Avatar answered Sep 28 '22 17:09

DSM