Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recursive Generators in Python

I wrote a function to return a generator containing every unique combination of sub-strings a given length that contain more than n elements from a primary string.

As an illustration:

if i have 'abcdefghi' and a probe of length of two, and a threshold of 4 elements per list i'd like to get:

['ab', 'cd', 'ef', 'gh']
['ab', 'de', 'fg', 'hi']
['bc', 'de', 'fg', 'hi']

My first attempt at this problem involved returning a list of lists. This ended up overflowing the memory of the computer. As a crude secondary solution, I created a generator that does something similar. The problem is that I created a nested generator that calls itself. When I run this function, it seems to just loop around the inner for loop without actually calling itself again. I thought that a generator would precede as far down the recursion hole as necessary until it hit the yield statement. Any clue what is happening?

def get_next_probe(self, current_probe_list, probes, unit_length):
    if isinstance(current_probe_list, list):
        last_probe=current_probe_list[-1]
        available_probes = [candidate for candidate in probes if candidate.start>last_probe.end]
    else:
        available_probes = [candidate for candidate in probes if candidate.start<unit_length]

    if available_probes:

        max_position=min([probe.end for probe in available_probes])
        available_probes2=[probe for probe in available_probes if max_position+1>probe.start]

        for new_last_probe in available_probes2:
            new_list=list(current_probe_list)
            new_list.append(new_last_probe)
            self.get_next_probe(new_list, probes, unit_length)

    else:
        if len(current_probe_list)>=self.num_units:
            yield current_probe_list

If yield is changed to print this works just fine! I'd appreciate any help I could get. I realize this isn't an optimal implementation of this type of search problem, it seems like returning a list of found positions from the last call of get_next_probe and filtering this list for the elements that do not overlap new_last_probe.end would be far more efficient... but this was a lot easier for me to write. Any algorithm input would still be appreciated.

Thanks!

like image 668
user1348971 Avatar asked Apr 22 '12 01:04

user1348971


1 Answers

I thought that a generator would precede as far down the recursion hole as necessary until it hit the yield statement

It will recurse fine, but to get the yielded value to propogate back outward, you need to do it explicitly - just like if it was a return, you would need to explicitly return the result of each recursion. So, instead of:

 self.get_next_probe(new_list, probes, unit_length)

You would do something like:

 for val in self.get_next_probe(new_list, probes, unit_length):
     yield val

Or if you're using Python 3.3 or newer, you can also do this:

yield from self.get_next_probe(new_list, probes, unit_length)
like image 66
lvc Avatar answered Sep 20 '22 15:09

lvc