Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Preferred Sorting For People Based On Their Age

Suppose we have 1 million entries of an object 'Person' with two fields 'Name', 'Age'. The problem was to sort the entries based on the 'Age' of the person.

I was asked this question in an interview. I answered that we could use an array to store the objects and use quick sort as that would save us from using additional space but interviewer told that memory was not a factor.

My question is what would be the factor that would decide which sort to use?

Also what would be the preferred way to store this?

In this scenario does any sorting algorithm have an advantage over another sorting algorithm and would result in a better complexity?

like image 765
Naman Choradia Avatar asked Sep 17 '16 06:09

Naman Choradia


People also ask

Which sorting technique is best for sorting?

Quicksort. Quicksort is one of the most efficient sorting algorithms, and this makes of it one of the most used as well. The first thing to do is to select a pivot number, this number will separate the data, on its left are the numbers smaller than it and the greater numbers on the right.

What are the 5 Classification of sorting?

Some adaptive sorting algorithms are : Bubble Sort, Insertion Sort and Quick Sort. On the other hand some non-adaptive sorting algorithms are : Selection Sort, Merge Sort, and Heap Sort. Internal Sorting : Sorting algorithms that use main memory exclusively during the sort are called internal sorting algorithms.

What are the 3 basic sorting categories?

What are the three types of sorting? The three types of basic sorting are bubble sort, insertion sort and selection sort.


2 Answers

This Stackoverflow link may be useful to you.

The answers above are sufficient but i would like to add some more information from the link above.

I am copying some information from the answers in, the link above, over here.

We should note that even if the fields in the Object are very big (i.e. long names) you do not need to use a file system sort, you can use an in-memory sort, because

# elements * 8 ~= 762 MB (most modern systems have enough memory for that)
             ^
        key(age) + pointer to struct requires 8 bytes in 32 bits system

It is important to minimize the disk accesses - because disks are not random access, and disk accesses are MUCH slower then RAM accesses.

Now, use a sort of your choice on that - and avoid using disk for the sorting process.

Some possibilities of sorts (on RAM) for this case are:

  • Standard quicksort or merge-sort (Which you had already thought of)
  • Bucket sort can also be applied here, since the rage is limited to [0,150] (Which others have specified here under the name Count Sort)
  • Radix sort (For the same reason, radix sort will need ceil(log_2(150)) ~= 8 iterations

I wanted to point out the memory aspect in case you may encounter the same question but may need to answer it taking the memory constraints into consideration. In fact your constraints are even less(10^6 compared to the 10^8 in the other question).

As for the matter of storing it -

The quickest way to sort it would be to allocate 151 linked lists/vector (let's call them buckets or whatever you may depending on the language you prefer) and put each person's data structure in the bucket according to his/her age(all people's ages are between 0 and 150):

bucket[person->age].add(person)

As others have pointed out Bucket Sort is going to be the better option for you.

In fact the beauty of bucket sort is that if you have to perform any operation on ranges of ages(like from 10-50 years of age) you can partition your bucket sizes according to your requirements(like have varied bucket range for each bucket).

I repeat again i have copied the information from the answers in the link given above, but i believe they might be useful to you.

like image 138
Rahul Nori Avatar answered Sep 21 '22 17:09

Rahul Nori


If the array has n elements, then quicksort (or, actually, any comparison-based sort) is Ω(n log(n)).

Here, though, it looks like you have here an alternative to comparison-based sorting, since you need to sort only on age. Suppose there are m distinct ages. In this case, Counting Sort, will be Θ(m + n). For the specifics of your question, assuming that age is in years, m is much smaller than n, and you can do this in linear time.

The implementation is trivial. Simply create an array of, say, 200 entries (200 being an upper bound on the age). The array is of linked lists. Scan over the people, and place each person in the linked list in the appropriate entry. Now, just concatenate the lists according to the positions in the array.

like image 36
Ami Tavory Avatar answered Sep 24 '22 17:09

Ami Tavory