Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to sort millions of rows of data in a file with less/meagre memory

Tags:

algorithm

(From here)

I attended an interview last week and this question was asked:

How do you sort a billion rows of data in a file with only 640KB of memory in a 8080 processor based machine? No virtual memory, no external disk.

I explicitly asked the interviewer if I could use a hard drive, so I can serialize trees as I sort them and then combine at the end. He said no. I tried many ways, different algorithms. Nothing he agreed.

I gave up and asked him politely, "how would you do that?" He bluntly said, "I would not tell you." (The interview ended right after that. I didn't mean to offend him, as a developer, I got curious. Moreover, it was an instinctive question, just as I would ask anyone at my workplace.)

This interview was for a really big bank.

So, how would anyone approach this problem?

like image 874
Matty H Avatar asked Oct 18 '10 16:10

Matty H


2 Answers

I would not do it in C#, for starters. Are you sure you have this tagged right? This is a C problem, if it can be solved.

640K only gives you 640 * 1024 * 8 bits so there's no way to solve this as framed. Perhaps that's the answer he/she was looking for. These Investment Bank interviews are something of a mindgame sometimes.

like image 27
Steve Townsend Avatar answered Oct 15 '22 06:10

Steve Townsend


Heapsort would be my reccomendation. It's relatively quick when n is large, and you only have to look at three elements with definite indecies at once.

That being said, my intuition tells me that sorting a billion rows on an 8080 even in C would be unfeasibly slow.

like image 107
Squirrelsama Avatar answered Oct 15 '22 05:10

Squirrelsama