Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most efficient sorting algorithm for a large set of numbers

I'm working on a large project, I won't bother to summarize it here, but this section of the project is to take a very large document of text (minimum of around 50,000 words (not unique)), and output each unique word in order of most used to least used (probably top three will be "a" "an" and "the").

My question is of course, what would be the best sorting algorithm to use? I was reading of counting sort, and I like it, but my concern is that the range of values will be too large compared to the number of unique words.

Any suggestions?

like image 803
aterimperator Avatar asked Jun 05 '09 03:06

aterimperator


People also ask

Which sorting algorithm is the most efficient for a large number of elements?

Quicksort is one of the most efficient sorting algorithms, and this makes of it one of the most used as well. The first thing to do is to select a pivot number, this number will separate the data, on its left are the numbers smaller than it and the greater numbers on the right.

Which is the fastest algorithm for sorting?

But since it has the upper hand in the average cases for most inputs, Quicksort is generally considered the “fastest” sorting algorithm.

Which sorting algorithm is the slowest algorithm for a large number of data?

3) Which sorting algorithm is the slowest algorithm for large number of data? Explanation: Quick sort, Heap sort and Shell sort all have best case time complexity as O(n log n) and Bubble sort has time complexity of O(n2). So, Bubble sort is slowest.


2 Answers

First, you will need a map of word -> count. 50,000 words is not much - it will easily fit in memory, so there's nothing to worry. In C++ you can use the standard STL std::map.

Then, once you have the map, you can copy all the map keys to a vector.

Then, sort this vector using a custom comparison operator: instead of comparing the words, compare the counts from the map. (Don't worry about the specific sorting algorithm - your array is not that large, so any standard library sort will work for you.)

like image 87
Igor Krivokon Avatar answered Nov 15 '22 21:11

Igor Krivokon


I'd start with a quicksort and go from there.

Check out the wiki page on sorting algorithms, though, to learn the differences.

like image 38
Eric Avatar answered Nov 15 '22 23:11

Eric