Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mergesort with Python

I couldn't find any working Python 3.3 mergesort algorithm codes, so I made one myself. Is there any way to speed it up? It sorts 20,000 numbers in about 0.3-0.5 seconds

def msort(x):     result = []     if len(x) < 2:         return x     mid = int(len(x)/2)     y = msort(x[:mid])     z = msort(x[mid:])     while (len(y) > 0) or (len(z) > 0):         if len(y) > 0 and len(z) > 0:             if y[0] > z[0]:                 result.append(z[0])                 z.pop(0)             else:                 result.append(y[0])                 y.pop(0)         elif len(z) > 0:             for i in z:                 result.append(i)                 z.pop(0)         else:             for i in y:                 result.append(i)                 y.pop(0)     return result 
like image 569
Hans Avatar asked Sep 12 '13 10:09

Hans


People also ask

Does Python have Mergesort?

Merge sort is one of the most prominent divide-and-conquer sorting algorithms in the modern era. It can be used to sort the values in any traversable data structure such as a list.

How do you do a Mergesort?

Algorithm for Merge SortStep 1: Find the middle index of the array. Step 2: Divide the array from the middle. Step 4: Call merge sort for the second half of the array. Step 5: Merge the two sorted halves into a single sorted array.

What is mergesort algorithm?

Merge sort is a sorting technique based on divide and conquer technique. With worst-case time complexity being Ο(n log n), it is one of the most respected algorithms. Merge sort first divides the array into equal halves and then combines them in a sorted manner.


2 Answers

The first improvement would be to simplify the three cases in the main loop: Rather than iterating while some of the sequence has elements, iterate while both sequences have elements. When leaving the loop, one of them will be empty, we don't know which, but we don't care: We append them at the end of the result.

def msort2(x):     if len(x) < 2:         return x     result = []          # moved!     mid = int(len(x) / 2)     y = msort2(x[:mid])     z = msort2(x[mid:])     while (len(y) > 0) and (len(z) > 0):         if y[0] > z[0]:             result.append(z[0])             z.pop(0)         else:             result.append(y[0])             y.pop(0)     result += y     result += z     return result 

The second optimization is to avoid popping the elements. Rather, have two indices:

def msort3(x):     if len(x) < 2:         return x     result = []     mid = int(len(x) / 2)     y = msort3(x[:mid])     z = msort3(x[mid:])     i = 0     j = 0     while i < len(y) and j < len(z):         if y[i] > z[j]:             result.append(z[j])             j += 1         else:             result.append(y[i])             i += 1     result += y[i:]     result += z[j:]     return result 

A final improvement consists in using a non recursive algorithm to sort short sequences. In this case I use the built-in sorted function and use it when the size of the input is less than 20:

def msort4(x):     if len(x) < 20:         return sorted(x)     result = []     mid = int(len(x) / 2)     y = msort4(x[:mid])     z = msort4(x[mid:])     i = 0     j = 0     while i < len(y) and j < len(z):         if y[i] > z[j]:             result.append(z[j])             j += 1         else:             result.append(y[i])             i += 1     result += y[i:]     result += z[j:]     return result 

My measurements to sort a random list of 100000 integers are 2.46 seconds for the original version, 2.33 for msort2, 0.60 for msort3 and 0.40 for msort4. For reference, sorting all the list with sorted takes 0.03 seconds.

like image 120
anumi Avatar answered Sep 21 '22 18:09

anumi


Code from MIT course. (with generic cooperator )

import operator   def merge(left, right, compare):     result = []     i, j = 0, 0     while i < len(left) and j < len(right):         if compare(left[i], right[j]):             result.append(left[i])             i += 1         else:             result.append(right[j])             j += 1     while i < len(left):         result.append(left[i])         i += 1     while j < len(right):         result.append(right[j])         j += 1     return result   def mergeSort(L, compare=operator.lt):     if len(L) < 2:         return L[:]     else:         middle = int(len(L) / 2)         left = mergeSort(L[:middle], compare)         right = mergeSort(L[middle:], compare)         return merge(left, right, compare) 
like image 31
David Yachnis Avatar answered Sep 19 '22 18:09

David Yachnis