I wrote a function to return a generator containing every unique combination of sub-strings a given length that contain more than n elements from a primary string.
As an illustration:
if i have 'abcdefghi' and a probe of length of two, and a threshold of 4 elements per list i'd like to get:
['ab', 'cd', 'ef', 'gh']
['ab', 'de', 'fg', 'hi']
['bc', 'de', 'fg', 'hi']
My first attempt at this problem involved returning a list of lists. This ended up overflowing the memory of the computer. As a crude secondary solution, I created a generator that does something similar. The problem is that I created a nested generator that calls itself. When I run this function, it seems to just loop around the inner for loop without actually calling itself again. I thought that a generator would precede as far down the recursion hole as necessary until it hit the yield statement. Any clue what is happening?
def get_next_probe(self, current_probe_list, probes, unit_length):
if isinstance(current_probe_list, list):
last_probe=current_probe_list[-1]
available_probes = [candidate for candidate in probes if candidate.start>last_probe.end]
else:
available_probes = [candidate for candidate in probes if candidate.start<unit_length]
if available_probes:
max_position=min([probe.end for probe in available_probes])
available_probes2=[probe for probe in available_probes if max_position+1>probe.start]
for new_last_probe in available_probes2:
new_list=list(current_probe_list)
new_list.append(new_last_probe)
self.get_next_probe(new_list, probes, unit_length)
else:
if len(current_probe_list)>=self.num_units:
yield current_probe_list
If yield is changed to print this works just fine! I'd appreciate any help I could get. I realize this isn't an optimal implementation of this type of search problem, it seems like returning a list of found positions from the last call of get_next_probe and filtering this list for the elements that do not overlap new_last_probe.end would be far more efficient... but this was a lot easier for me to write. Any algorithm input would still be appreciated.
Thanks!
I thought that a generator would precede as far down the recursion hole as necessary until it hit the yield statement
It will recurse fine, but to get the yield
ed value to propogate back outward, you need to do it explicitly - just like if it was a return
, you would need to explicitly return
the result of each recursion. So, instead of:
self.get_next_probe(new_list, probes, unit_length)
You would do something like:
for val in self.get_next_probe(new_list, probes, unit_length):
yield val
Or if you're using Python 3.3 or newer, you can also do this:
yield from self.get_next_probe(new_list, probes, unit_length)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With