
A Python Segmentation Fault?

This generates a Segmentation Fault: 11 and I have no clue why.

Before I get into it, here's the code:

import numpy.random as nprnd
import heapq
import sys

sys.setrecursionlimit(10**6)


def rlist(size, limit_low, limit_high):
    for _ in xrange(size): 
        yield nprnd.randint(limit_low, limit_high)

def iterator_mergesort(iterator, size):
    return heapq.merge(
         iterator_mergesort(
           (iterator.__next__ for _ in xrange(size/2)), size/2),
         iterator_mergesort(
            iterator, size - (size/2))
       )

def test():
    size = 10**3
    randomiterator = rlist(size, 0, size)
    sortediterator = iterator_mergesort(randomiterator, size)
    assert sortediterator == sorted(randomiterator)

if __name__ == '__main__':
    test()

Basically, it's just a mergesort that works on iterators and generator expressions instead of lists, so as to minimize the memory footprint at any one time. It's nothing special, and it uses the standard library function heapq.merge() to merge the iterators, so I was quite surprised when everything broke.

Running the code quickly gives Segmentation Fault: 11 and an error window telling me python has crashed. I have no idea where to look or how to debug this one, so any help would be much appreciated.

reem asked Oct 01 '13 23:10


1 Answer

Segmentation faults in Python usually happen for one of two reasons:

1. You run out of memory
2. Bug in a C module

Here, the segfault falls into the first category. You (I) have an unbounded recursion: there is no base case in iterator_mergesort(), so it keeps calling itself forever.

Normally, Python raises an exception for this and terminates before causing a segfault. However, the recursion limit has been set extremely high (sys.setrecursionlimit(10**6)), so Python runs out of stack memory and crashes before it reaches the depth at which it would raise that exception.
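To see the normal behaviour for yourself, here is a minimal sketch (the function names are made up for illustration, not part of the original code). With a modest recursion limit, a recursion with no base case ends in an exception rather than a crash. It catches RuntimeError because that is what Python 2 raises here, and Python 3's RecursionError is a subclass of it.

```python
import sys


def unbounded(depth=0):
    # No base case: left alone, this call chain never stops.
    return unbounded(depth + 1)


def recursion_is_caught(limit=1000):
    # Hypothetical helper: run the runaway recursion under a modest limit
    # and report whether the interpreter stopped it with an exception.
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        unbounded()
    except RuntimeError:
        # RecursionError (Python 3) is a subclass of RuntimeError.
        return True
    finally:
        # Restore whatever limit was in place before.
        sys.setrecursionlimit(old_limit)
    return False


if __name__ == '__main__':
    print(recursion_is_caught())
```

Raising the limit to something like 10**6 removes this safety net, which is why the original script crashes instead.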

Add a base case like so:

...
def iterator_mergesort(iterator, size):
    return heapq.merge(
         iterator_mergesort(
           (iterator.next() for _ in xrange(size/2)), size/2),
         iterator_mergesort(
            iterator, size - (size/2))
       ) if size >= 2 else iterator #<-- Specifically this

Now it passes the test() function and sorts, albeit rather slowly.
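For anyone on Python 3, here is a rough sketch of the same idea with an explicit base case. It is an illustration under a few assumptions, not the original code: it uses itertools.islice and the random module instead of numpy, check_sort is a made-up helper, and each leaf reads its element eagerly so the two halves never interleave reads on the shared iterator. That means the input is read up front and only the merging stays lazy. The check compares lists, because the object returned by heapq.merge() never compares equal to a list.

```python
import heapq
import random
from itertools import islice


def iterator_mergesort(iterator, size):
    # Base case: zero or one element is already sorted. Reading it eagerly
    # keeps the halves from interleaving reads on the shared iterator.
    if size < 2:
        return list(islice(iterator, size))
    half = size // 2
    # The left half is fully read before the right half starts.
    left = iterator_mergesort(iterator, half)
    right = iterator_mergesort(iterator, size - half)
    # heapq.merge lazily merges two already-sorted iterables.
    return heapq.merge(left, right)


def check_sort(size=1000, seed=0):
    # Hypothetical helper: sort seeded random data and compare with sorted().
    rng = random.Random(seed)
    data = [rng.randrange(size) for _ in range(size)]
    result = list(iterator_mergesort(iter(data), size))
    return result == sorted(data)


if __name__ == '__main__':
    print(check_sort())
```

The recursion depth here grows with the logarithm of size, so the default recursion limit is plenty and sys.setrecursionlimit() is not needed.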

reem answered Oct 13 '22 17:10