Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to sort (million/billion/...) integers?

Sometimes interviewers ask how to sort million/billion 32-bit integers (e.g. here and here). I guess they expect the candidates to compare O(NLog(N)) sort with radix sort. For million integers O(NLog(N)) sort is probably better but for billion they are probably the same. Does it make sense ?

like image 774
Michael Avatar asked Nov 08 '10 19:11

Michael


People also ask

What is the most efficient way to sort a million integers?

Integer Sorting with limited range is achieved efficiently with Bucket Sorting. Bucket sort, counting sort, radix sort, and van Emde Boas tree sorting all work best when the key size is small; for large enough keys, they become slower than comparison sorting algorithm.

What's the fastest way to sort 1 million 32 bit integers?

If it's meant as speed, the radix sort is the fastest theoretical sort for this. Especially as we know that the integers are 32bit and thus can only be in a range of values. That range however, is quite large. A radix sort is a O(N) algorithm - meaning it needs to step through each item in the order of once per item.

How do you find the 100 largest numbers out of an array of 1 billion numbers?

Use Heap building the MAX heap will take O(N),then the top 100 max numbers will be in the top of the Heap, all you need is to get them out from the heap(100 X O(Log(N)).

What is the best and most practical sorting algorithm to use for sorting an array of 1 billion integers?

Solution: Generally speaking, Quick sort is probably the best algorithm to sort numbers when nothing is given about the nature (like range, order etc.) of numbers.


2 Answers

If you get a question like this, they are not looking for the answer. What they are trying to do is see how you think through a problem. Do you jump right in, or do you ask questions about the project requirements?

One question you had better ask is, "How optimal of solution does the problem require?" Maybe a bubble sort of records stored in a file is good enough, but you have to ask. Ask questions about what if the input changes to 64 bit numbers, should the sort process be easily updated? Ask how long does the programmer have to develop the program.

Those types of questions show me that the candidate is wise enough to see there is more to the problem than just sorting numbers.

like image 129
aaaa bbbb Avatar answered Oct 04 '22 14:10

aaaa bbbb


I expect they're looking for you to expand on the difference between internal sorting and external sorting. Apparently people don't read Knuth nowadays

like image 36
The Archetypal Paul Avatar answered Oct 04 '22 12:10

The Archetypal Paul