I am working on a problem that involves validating a format embedded within a unified diff patch.
The variables within the inner format can span multiple lines at a time, so I wrote a generator that pulls each line and yields the variable when it is complete.
To avoid having to rewrite this function when reading from a unified diff file, I created a generator that strips the unified diff characters from each line before passing it to the inner format validator. However, I am getting stuck in an infinite loop (both in the code and in my head). I have abstracted the problem to the following code. I'm sure there is a better way to do this. I just don't know what it is.
from collections import Iterable

def inner_format_validator(inner_item):
    # Do some validation to inner items
    return inner_item[0] != '+'

def inner_gen(iterable):
    for inner_item in iterable:
        # Operates only on inner_info type data
        yield inner_format_validator(inner_item)

def outer_gen(iterable):
    class DecoratedGenerator(Iterable):
        def __iter__(self):
            return self
        def next(self):
            # Using iterable from closure
            for outer_item in iterable:
                self.outer_info = outer_item[0]
                inner_item = outer_item[1:]
                return inner_item
    decorated_gen = DecoratedGenerator()
    for inner_item in inner_gen(decorated_gen):
        yield inner_item, decorated_gen.outer_info

if __name__ == '__main__':
    def wrap(string):
        # The point here is that I don't know what the first character will be
        pseudo_rand = len(string)
        if pseudo_rand * pseudo_rand % 2 == 0:
            return '+' + string
        else:
            return '-' + string

    inner_items = ["whatever"] * 3
    # wrap screws up inner_format_validator
    outer_items = [wrap("whatever")] * 3
    # I need to be able to
    # iterate over inner_items
    for inner_info in inner_gen(inner_items):
        print(inner_info)
    # and iterate over outer_items
    for outer_info, inner_info in outer_gen(outer_items):
        # This is an infinite loop
        print(outer_info)
        print(inner_info)
Any ideas as to a better, more pythonic way to do this?
I would do something simpler, like this:
def outer_gen(iterable):
    iterable = iter(iterable)
    first_item = next(iterable)
    info = first_item[0]
    yield info, first_item[1:]
    for item in iterable:
        yield info, item
This executes the first four lines only once, then enters the loop and yields what you want.
You probably want to add some try/except blocks to catch IndexError here and there.
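As a rough sketch of what that error handling could look like: the version below keeps the same logic as the function above, but returns quietly on empty input and raises a ValueError when the first item is an empty string. Both of those policies are assumptions made for illustration, not something the original problem dictates.

```python
def outer_gen(iterable):
    """Yield (prefix, body) pairs, reusing the first item's prefix throughout."""
    iterator = iter(iterable)
    try:
        first_item = next(iterator)
    except StopIteration:
        # Assumption: empty input simply produces no output.
        return
    try:
        info = first_item[0]
    except IndexError:
        # Assumption: an empty first item is treated as invalid input.
        raise ValueError("first item is empty, so it has no prefix character")
    yield info, first_item[1:]
    for item in iterator:
        yield info, item
```

Catching StopIteration explicitly matters on recent Python 3 versions, where a StopIteration escaping from inside a generator body is turned into a RuntimeError.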
If you want to take values while they start with something, or the contrary, remember that you can use a lot of stuff from the itertools toolbox, in particular dropwhile, takewhile and chain:
>>> import itertools
>>> l = ['+foo', '-bar', '+foo']
>>> list(itertools.takewhile(lambda x: x.startswith('+'), l))
['+foo']
>>> list(itertools.dropwhile(lambda x: x.startswith('+'), l))
['-bar', '+foo']
>>> a = itertools.takewhile(lambda x: x.startswith('+'), l)
>>> b = itertools.dropwhile(lambda x: x.startswith('+'), l)
>>> list(itertools.chain(a, b))
['+foo', '-bar', '+foo']
And remember that you can create generator expressions just like list comprehensions, store them in variables and chain them, just like you would pipe Linux commands:
import random

def create_item():
    # Build a random line such as '+foo' or '-bar'
    return random.choice(('+', '-')) + random.choice(('foo', 'bar'))

# Each stage is lazy; nothing runs until list() consumes the last one
random_items = (create_item() for _ in range(10))
added_items = ((i[0], i[1:]) for i in random_items if i.startswith('+'))
valid_items = ((prefix, line) for prefix, line in added_items if 'foo' in line)
print(list(valid_items))
With all this, you should be able to find some pythonic way to solve your problem :-)
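For what it's worth, here is one way the pieces could fit together for the code in the question. The `next` method there restarts its `for` loop over the list on every call, so it returns the first item forever and never raises StopIteration. The sketch below instead strips the prefix in a plain generator and records it in a list that acts as a side channel. The names `strip_prefix` and `prefixes` are made up for this example, and `inner_format_validator` is the stand-in from the question.

```python
def inner_format_validator(inner_item):
    # Stand-in validation, copied from the question.
    return inner_item[0] != '+'

def inner_gen(iterable):
    for inner_item in iterable:
        yield inner_format_validator(inner_item)

def strip_prefix(lines, prefixes):
    """Yield each line without its first character, recording that character."""
    for line in lines:
        prefixes.append(line[:1])
        yield line[1:]

def outer_gen(lines):
    prefixes = []  # filled lazily by strip_prefix as inner_gen pulls lines
    for result in inner_gen(strip_prefix(lines, prefixes)):
        # prefixes[-1] is the prefix of the last line inner_gen consumed
        yield prefixes[-1], result

outer_items = ['+whatever', '-whatever', '+whatever']
print(list(outer_gen(outer_items)))
# [('+', True), ('-', True), ('+', True)]
```

Because `inner_gen` only ever sees the stripped lines, it does not need to change. If a variable spans several lines, `prefixes` holds every prefix seen so far, so you can inspect more than just the last one.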