Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting strings containing numbers in a user friendly way

Tags:

Being used to the standard way of sorting strings, I was surprised when I noticed that Windows sorts files by their names in a kind of advanced way. Let me give you an example:

Track1.mp3
Track2.mp3
Track10.mp3
Track20.mp3

I think that those names are compared (during sorting) based on letters and by numbers separately.

On the other hand, the following is the same list sorted in a standard way:
Track1.mp3
Track10.mp3
Track2.mp3
Track20.mp3

I would like to create a comparing alogorithm in Delphi that would let me sort strings in the same way. At first I thought it would be enough to compare consecutive characters of two strings while they are letters. When a digit would be found at some position of both the strings, I would read all digits following them to form a number and then compare the numbers.

To give you an example, I'll compare "Track10" and "Track2" strings this way:
1) read characters while they are equal and while they are letters: "Track", "Track"
2) if a digit is found, read all following digits: "10", "2"
2a) if they are equal, go to 1 or else finish
Ten is greater than two, so "Track10" is greater than "Track2"

It had seemed that everything would be all right until I noticed, during my tests, that Windows considered "Track010" lower than "Track10", while I thought the first one was greater as it was longer (not mentioning that according to my algorithm both the strings would be equal, which is wrong).

Could you provide me with the idea how exactly Windows sorts files by names or maybe you have a ready-to-use algorithm (in any programming language) that I could base on?

Thanks a lot!
Mariusz

like image 772
Mariusz Schimke Avatar asked Jun 20 '09 18:06

Mariusz Schimke


People also ask

Can you sort a list which contains both numeric and string data?

Definition and Usage Note: You cannot sort a list that contains BOTH string values AND numeric values.


1 Answers

Jeff wrote up an article about this on Coding Horror. This is called natural sorting, where you effectively treat a group of digits as a single "character". There are implementations out there in every language under the sun, but strangely it's not usually built-in to most languages' standard libraries.

like image 84
John Kugelman Avatar answered Sep 21 '22 00:09

John Kugelman