Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String sorting using Merge Sort

What will be the worst complexity for sorting n strings having n characters each? Will it be just n times its avg. case O(n log n) or something else...?

like image 831
Abhishek Avatar asked Feb 26 '12 00:02

Abhishek


2 Answers

When you are talking about O notation with two things with different lengths, typically you want to use different variables, like M and N.

So, if your merge sort is O(N log N), where N is the number of strings... and comparing two strings is O(M) where M scales with the length of the string, then you'll be left with:

O(N log N) * O(M)

or

O(M N log N)

where M is the string length and N is the number of strings. You want to use different labels because they don't mean the same thing.

In the strange case where the average string length scales with the number of strings, like if you had a matrix stored in strings or something like that, you could argue that M = N, and then you'd have O(N^2 log N)

like image 92
Donald Miner Avatar answered Nov 15 '22 05:11

Donald Miner


As @orangeoctopus, using standard ranking algorithm on a collection of n strings of size n will result in O(n^2 * logn) computation.

However - note that you can do it in O(n^2), with variations on radix sort.

The simplest way to do it [in my opinion] - is

  1. build a trie, and populate it with all your strings. Entering each string is O(n) and you do it n times - total of O(n^2)
  2. do a DFS on the trie, each time you encounter the mark for end for string - add it to the sorted collection. The order of the strings added this way is lexicographically, so your list will be sorted lexicographically when you are done.

It is easy to see you cannot do it any better then O(n^2), since only reading the data is O(n^2), thus this solution is optimal in terms of big O notation of time complexity.

like image 38
amit Avatar answered Nov 15 '22 05:11

amit