When querying an API that returns a paginated list of unknown length, I found myself doing essentially this:
    def fetch_one(self, n):
        data = json.load(urlopen(url_template % n))
        if data is None:
            self.finished = True
            return
        for row in data:
            if row_is_weird(row):
                self.finished = True
                return
            yield prepare(row)

    def work(self):
        n = 1
        self.finished = False
        while not self.finished:
            consume(self.fetch_one(n))
            n += 1
The split between work() and fetch_one() makes it very easy to test, but the signalling via instance variables means I can't have more than one work() going on at the same time, which sucks. I came up with what I think is a cleaner solution, but it involves an iterator with two "done" states, and I have no idea what to call it. I'm sure this pattern exists elsewhere, so I'd appreciate pointers (or reasons why this is stupid):
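    class StopThisThing(Exception):
        """Raised inside the generator to end the whole scan, not just this page."""

    class Thing(object):
        def __init__(self, gen):
            self.gen = gen
            self.finished = False

        def __iter__(self):
            return self

        def __next__(self):
            try:
                v = next(self.gen)
            except StopThisThing:
                # Second "done" state: stop this page and the outer loop.
                self.finished = True
                raise StopIteration
            else:
                return v

        next = __next__  # Python 2 spelling of the same method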
which I'd then use like this:
    @thinged
    def fetch_one(self, n):
        data = json.load(urlopen(url_template % n))
        if data is None:
            raise StopThisThing()
        for row in data:
            if row_is_weird(row):
                raise StopThisThing()
            yield prepare(row)

    def work(self):
        n = 1
        while True:
            one = self.fetch_one(n)
            consume(one)
            if one.finished:
                break
            n += 1
So what is this Thing I have created?
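The thinged decorator is used above but never shown. The following is a minimal runnable sketch of how it might look, assuming it does nothing more than wrap the generator returned by the decorated function in a Thing. PAGES and the "row == 6" check are hypothetical stand-ins for the real API call and row_is_weird().

```python
import functools


class StopThisThing(Exception):
    """Raised by a page generator to stop the whole scan."""


class Thing(object):
    def __init__(self, gen):
        self.gen = gen
        self.finished = False

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self.gen)
        except StopThisThing:
            # Remember why we stopped, then end iteration normally.
            self.finished = True
            raise StopIteration


def thinged(func):
    """Assumed behaviour: wrap the generator from func in a Thing."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return Thing(func(*args, **kwargs))
    return wrapper


PAGES = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical stand-in for the API


@thinged
def fetch_one(n):
    for row in PAGES[n]:
        if row == 6:  # stand-in for row_is_weird(row)
            raise StopThisThing()
        yield row


def work():
    seen = []
    n = 0
    while True:
        one = fetch_one(n)
        seen.extend(one)  # stand-in for consume(one)
        if one.finished:
            break
        n += 1
    return seen
```

Because the finished flag lives on each Thing instance rather than on self, two work() loops can run side by side without sharing state.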
In Java, iteration uses the hasNext() and next() methods in a loop. Classes that implement the Iterable interface must implement the iterator() method. Classes that implement the Iterator interface must implement hasNext() and next(); remove() is optional and has had a default implementation since Java 8.
Iterators are used to traverse from one element to the next, a process known as iterating through the container. The main advantage of an iterator is that it provides a common interface for all container types, which makes algorithms independent of the type of container used.
Iterators also allow access to each member of a group individually, without affecting the rest of the group. They are available in many scripting and programming languages, including C++, Java, PHP, and Perl. Their interface is independent of the objects they scan, so the same code can traverse any kind of collection.
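In Python, the same roles are played by __iter__() and __next__(), with a raised StopIteration taking the place of hasNext() returning false. A small sketch follows; Countdown and total are hypothetical names used only for illustration.

```python
class Countdown:
    """Iterator that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals "no more elements"
        value = self.current
        self.current -= 1
        return value


def total(items):
    """Works with any iterable: the loop only relies on the iterator protocol."""
    result = 0
    for item in items:
        result += item
    return result
```

The total() function does not care whether it receives a list or a Countdown, which is the container independence described above.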
I think that you can avoid that by yielding something special.
I had to build my own runnable example, to show what I mean:
    def fetch_one(n):
        lst = [[1,2,3], [4,5,6], [7,8,9]][n]
        for x in lst:
            if x == 6:
                yield 'StopAll'
                return
            yield x

    def work():
        n = 0
        in_progress = True
        while in_progress:
            numbers_iterator = fetch_one(n)
            for x in numbers_iterator:
                if x == 'StopAll':
                    in_progress = False
                    break
                print('x =', x)
            n += 1

    work()
Output:
    x = 1
    x = 2
    x = 3
    x = 4
    x = 5
I like this more than self.finished or a decorator like the one you built, but I think that something better could still be found. (Maybe this answer could help you with that.)
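One caveat with a string marker such as 'StopAll' is that it could collide with a real value in the data. A possible variation, sketched here under the same toy data, uses a unique sentinel object compared by identity. STOP_ALL and PAGES are hypothetical names, and work() returns its results instead of printing them so they are easy to check.

```python
STOP_ALL = object()  # unique marker; no real row can be identical to it

PAGES = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical stand-in data


def fetch_one(n):
    for x in PAGES[n]:
        if x == 6:  # stand-in for the "weird row" check
            yield STOP_ALL
            return
        yield x


def work():
    results = []
    n = 0
    while n < len(PAGES):
        for x in fetch_one(n):
            if x is STOP_ALL:  # identity check, not equality
                return results
            results.append(x)
        n += 1
    return results
```

All state is local to work() here, so several scans can run at the same time.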
Update: A much simpler solution might be to transform fetch_one into a class that carries its own finished flag.
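As a rough sketch of that idea (PageFetcher is a hypothetical name, and the list of pages stands in for the real API), each fetcher instance owns its flag, so separate scans do not interfere with each other:

```python
class PageFetcher:
    """Fetches pages one at a time and records when the scan should stop."""

    def __init__(self, pages):
        self.pages = pages  # stand-in for the remote API
        self.finished = False

    def fetch_one(self, n):
        # A missing or empty page ends the scan.
        if n >= len(self.pages) or not self.pages[n]:
            self.finished = True
            return
        for x in self.pages[n]:
            if x == 6:  # stand-in for the "weird row" check
                self.finished = True
                return
            yield x


def work(pages):
    fetcher = PageFetcher(pages)  # one fetcher, and one flag, per scan
    results = []
    n = 0
    while not fetcher.finished:
        results.extend(fetcher.fetch_one(n))
        n += 1
    return results
```

For example, work([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) collects the values 1 through 5 and then stops.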
A decorator approach to this solution might be:
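    class stopper(object):
        def __init__(self, func):
            self.func = func
            self.finished = False

        def __call__(self, *args, **kwargs):
            empty = True
            for x in self.func(*args, **kwargs):
                if x == 6:
                    self.finished = True
                    # Use return, not raise StopIteration: since Python 3.7
                    # (PEP 479) the latter becomes a RuntimeError in a generator.
                    return
                empty = False
                yield x
            # A bare for/else would also fire after every complete page,
            # so only an empty page marks the scan as finished.
            if empty:
                self.finished = True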
Basically, you no longer care how fetch_one works, only whether what it yields is OK or not.
Usage example:
    @stopper
    def fetch_one(n):
        lst = [[1,2,3], [4,5,6], [7,8,9]][n]
        #lst = [[1,2,3], [], [4,5,6], [7,8,9]][n] # uncomment to test for/else
        for x in lst:
            yield x

    def work():
        n = 0
        while not fetch_one.finished:
            for x in fetch_one(n):
                print('x =', x)
            n += 1