I have an iterator it
that I'm assuming already sorted but I would like to raise an exception if it isn't.
Data from iterator is not in memory so I do not want to use sorted()
builtin because AFAIK it puts the whole iterator in a list.
The solution I'm using now is to wrap the iterator in a generator function like this:
def checkSorted(it):
prev_v = it.next()
yield prev_v
for v in it:
if v >= prev_v:
yield v
prev_v = v
else:
raise ValueError("Iterator is not sorted")
So that I can use it like this:
myconsumer(checkSorted(it))
Does someone know if there are better solutions?
I know that my solution works but it seems quite strange (at least to me) writing a module on my own to accomplish such a trivial task. I'm looking for a simple one liner or builtin solution (If it exists)
Basically your solution is almost as elegant as it gets (you could of course put it in an utility module if you find it generally useful). You could if you wanted it use an infinity object to cut the code down a bit, but then you have to include a class definition as well which grows the code again (unless you inline the class definition):
def checkSorted(it):
prev = type("", (), {"__lt__": lambda a, b: False})()
for x in it:
if prev < x:
raise ValueError("Not sorted")
prev = x
yield x
The first line is using the type
to first create a class and then instantiate it. Objects of this class compares less than to anything (infinity object).
The problem with doing a one-liner is that you have to deal with three constructs: you have to update state (assignment), throw an exception and doing a loop. You could easily perform these by using statements, but making them into a oneliner will mean that you will have to try to put the statements on the same line - which in turn will result in problem with the loop and if
-constructs.
If you want to put the whole thing into an expression you will have to use dirty tricks to do these, the assignment and looping the iterutils
can provide and the throwing can be done by using the throw
method in a generator (which can be provided in an expression too):
imap( lambda i, j: (i >= j and j or (_ for _ in ()).throw(ValueError("Not sorted"))), *(lambda pre, it: (chain([type("", (), {"__ge__": lambda a, b: True})()], pre), it))(*tee(it)))
The last it
is the iterator you want to check and the expression evaluates to a checked iterator. I agree it's not good looking and not obvious what it does, but you asked for it (and I don't think you wanted it).
As an alternative i suggest to use itertools.izip_longest
(and zip_longest
in python 3 )to create a generator contains consecutive pairs :
You can use tee
to create 2 independent iterators from a first iterable.
from itertools import izip_longest,tee
def checkSorted(it):
pre,it=tee(it)
next(it)
for i,j in izip_longest(pre,it):
if j:
if i >= j:
yield i
else:
raise ValueError("Iterator is not sorted")
else :
yield i
Demo :
it=iter([5,4,3,2,1])
print list(checkSorted(it))
[5, 4, 3, 2, 1]
it=iter([5,4,3,2,3])
print list(checkSorted(it))
Traceback (most recent call last):
File "/home/bluebird/Desktop/ex2.py", line 19, in <module>
print list(checkSorted(it))
File "/home/bluebird/Desktop/ex2.py", line 10, in checkSorted
raise ValueError("Iterator is not sorted")
ValueError: Iterator is not sorted
Note : Actually I think there is no need to yield the values of your iterable wen you have them already.So as a more elegant way I suggest to use a generator expression within all
function and return a bool value :
from itertools import izip,tee
def checkSorted(it):
pre,it=tee(it)
next(it)
return all(i>=j for i,j in izip(pre,it))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With