Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does functools.lru_cache break this function?

Consider the following function, which returns all the unique permutations of a set of elements:

def get_permutations(elements):
    if len(elements) == 0:
        yield ()
    else:
        unique_elements = set(elements)
        for first_element in unique_elements:
            remaining_elements = list(elements)
            remaining_elements.remove(first_element)
            for subpermutation in get_permutations(tuple(remaining_elements)):
                yield (first_element,) + subpermutation

for permutation in get_permutations((1, 1, 2)):
    print(permutation)

This prints

(1, 1, 2)
(1, 2, 1)
(2, 1, 1)

as expected. However, when I add the lru_cache decorator, which memoizes the function:

import functools

@functools.lru_cache(maxsize=None)
def get_permutations(elements):
    if len(elements) == 0:
        yield ()
    else:
        unique_elements = set(elements)
        for first_element in unique_elements:
            remaining_elements = list(elements)
            remaining_elements.remove(first_element)
            for subpermutation in get_permutations(tuple(remaining_elements)):
                yield (first_element,) + subpermutation

for permutation in get_permutations((1, 1, 2)):
    print(permutation)

it prints the following:

(1, 1, 2)

Why is it only printing the first permutation?

like image 590
user76284 Avatar asked Apr 22 '16 21:04

user76284


1 Answers

lru.cache memoizes the return value of your function. Your function returns a generator. Generators have state and can be exhausted (i.e., you come to the end of them and no more items are yielded). Unlike the undecorated version of the function, the LRU cache gives you the exact same generator object each time the function is called with a given set of arguments. It had better, because that's what it's for!

But some of the generators you're caching are used more than once and are partially or completely exhausted when they are used the second and subsequent times. (They may even be "in play" more than once simultaneously.)

To explain the result you're getting, consider what happens when the length of elements is 0 and you yield ()... the first time. The next time this generator is called, it is already at the end and doesn't yield anything at all. Thus your subpermutation loop does nothing and nothing further is yielded from it. As this is the "bottoming out" case in your recursion, it is vital to the program working, and losing it breaks the program's ability to yield the values you expect.

The generator for (1,) is also used twice, and this breaks the third result before it even gets down to ().

To see what's happening, add a print(elements) as the first line in your function (and add some kind of marker to the print call in the main for loop, so you can tell the difference). Then compare the output of the memoized version vs the original.

It seems like you probably want some way to memoize the result of a generator. What you want to do in that case is write it as a function that returns a list with all the items (rather than yielding an item ts a time) and memoize that.

like image 118
kindall Avatar answered Sep 20 '22 20:09

kindall