Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Check if iterator is sorted

I have an iterator it that I'm assuming already sorted but I would like to raise an exception if it isn't.

Data from iterator is not in memory so I do not want to use sorted() builtin because AFAIK it puts the whole iterator in a list.

The solution I'm using now is to wrap the iterator in a generator function like this:

def checkSorted(it):
  prev_v = it.next()
  yield prev_v
  for v in it:
    if v >= prev_v:
      yield v
      prev_v = v
    else:
      raise ValueError("Iterator is not sorted")

So that I can use it like this:

myconsumer(checkSorted(it))

Does someone know if there are better solutions?

I know that my solution works but it seems quite strange (at least to me) writing a module on my own to accomplish such a trivial task. I'm looking for a simple one liner or builtin solution (If it exists)

like image 577
Zac Avatar asked Jun 24 '15 08:06

Zac


2 Answers

Basically your solution is almost as elegant as it gets (you could of course put it in an utility module if you find it generally useful). You could if you wanted it use an infinity object to cut the code down a bit, but then you have to include a class definition as well which grows the code again (unless you inline the class definition):

def checkSorted(it):
    prev = type("", (), {"__lt__": lambda a, b: False})()

    for x in it:
        if prev < x:
            raise ValueError("Not sorted")
        prev = x
        yield x

The first line is using the type to first create a class and then instantiate it. Objects of this class compares less than to anything (infinity object).

The problem with doing a one-liner is that you have to deal with three constructs: you have to update state (assignment), throw an exception and doing a loop. You could easily perform these by using statements, but making them into a oneliner will mean that you will have to try to put the statements on the same line - which in turn will result in problem with the loop and if-constructs.

If you want to put the whole thing into an expression you will have to use dirty tricks to do these, the assignment and looping the iterutils can provide and the throwing can be done by using the throw method in a generator (which can be provided in an expression too):

imap( lambda i, j: (i >= j and j or (_ for _ in ()).throw(ValueError("Not sorted"))), *(lambda pre, it: (chain([type("", (), {"__ge__": lambda a, b: True})()], pre), it))(*tee(it)))

The last it is the iterator you want to check and the expression evaluates to a checked iterator. I agree it's not good looking and not obvious what it does, but you asked for it (and I don't think you wanted it).

like image 187
skyking Avatar answered Sep 18 '22 07:09

skyking


As an alternative i suggest to use itertools.izip_longest (and zip_longest in python 3 )to create a generator contains consecutive pairs :

You can use tee to create 2 independent iterators from a first iterable.

from itertools import izip_longest,tee

def checkSorted(it):
  pre,it=tee(it)
  next(it)
  for i,j in izip_longest(pre,it):
    if j:
      if i >= j:         
        yield i
      else:
        raise ValueError("Iterator is not sorted")
    else :
        yield i

Demo :

it=iter([5,4,3,2,1])
print list(checkSorted(it))

[5, 4, 3, 2, 1]

it=iter([5,4,3,2,3])
print list(checkSorted(it))

Traceback (most recent call last):
  File "/home/bluebird/Desktop/ex2.py", line 19, in <module>
    print list(checkSorted(it))
  File "/home/bluebird/Desktop/ex2.py", line 10, in checkSorted
    raise ValueError("Iterator is not sorted")
ValueError: Iterator is not sorted

Note : Actually I think there is no need to yield the values of your iterable wen you have them already.So as a more elegant way I suggest to use a generator expression within all function and return a bool value :

from itertools import izip,tee

def checkSorted(it):
  pre,it=tee(it)
  next(it)
  return all(i>=j for i,j in izip(pre,it))
like image 29
Mazdak Avatar answered Sep 21 '22 07:09

Mazdak