Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which algorithm to use for alphabetical sort?

A lot of sorting algorithms are based on comparisons of numbers. If I correctly understand, when we are using comparison algorithms for alphabetical sort, we compare char codes(their integer presentation) and sort depending on their values. (That's why at ASCII table letter B has a bigger code then A). But during this comparing we sort only by the first letter and not the whole word. When we use db query with ORDER BY we are getting sorting for the whole words. (As I understand the reason of it is db background mechanisms like indexes etc.). Also I heard about Radix sort (sorry, but never used it before) and as I can see it can help with alphabetical sort(maybe I'm wrong).

What algorithm is better to use for sorting by the whole words?

Not correct:

Adam
Aaron
Antony

Correct:

Aaron
Adam
Antony

And am I correct with my assumptions about the whole workflow?

like image 917
Viacheslav Kondratiuk Avatar asked Nov 25 '13 08:11

Viacheslav Kondratiuk


2 Answers

You're not quite correct with the assumption about "Compare only the first letter". The algorithm is - if the first letters are the same, compare the next letter. And the next. And the next. Until either you find some letters that are different, or one of the strings runs out.

Also note that simply comparing by ASCII codes is not always enough. Sometimes you need to do a case-insensitive comparison where you consider A to be equal to a. Sometimes you need to do accent-insensitive comparison where you consider ā to be equal to a. And sometimes you need to account for crazy language shit where ß is equal to ss or worse.

My advice is - your programming language should probably have some mechanism for comparing strings. Use that. Don't roll out your own.

After that, any sorting algorithm will work. They all use one simple assumption - that you can compare the items that you sort. Whether they are integers, strings or complex objects, is irrelevant. As long as you can take any two objects and say "this one is bigger and this one is smaller", you're good to go.

(Note also that you need to be consistent about it. If A==B and B==C, then also you need to make sure that A==C. Similarly if A < B and B < C, then you must A < C. Etc.)

like image 143
Vilx- Avatar answered Sep 18 '22 14:09

Vilx-


No, the sorting is not based on first character or length. Alphabetical or better to put it as lexicographical ordering are done in the following way,

In C++ the comparison function would look like this,

bool operator<(const string &a, const string &b){
    int l = min(a.size(),b.size());
    for(int i = 0; i < l; i++){
        if( a[i] > b[i]) return false; // a is greater than b
        if( b[i] > a[i]) return true;  // b is greater than a
    }
    if ( a.size() > l) return false;   // a is greater than b
    return true;                       // b is greater than a
}
like image 27
Fallen Avatar answered Sep 21 '22 14:09

Fallen