I have an array containing Roman numerals (as strings of course). Like this:
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII');
I'd like to sort them according to the numeric values of these numerals, so the results should be something like:
$sorted_a = array('III', 'V', 'XIII', 'XIX', 'LII', 'MCCXCIV');
So my question is: what is the best way to sort an array of Roman numerals? I know how to use the array sorting functions of PHP, I'm interested in the logic that goes on inside the comparison function.
EDIT: For simplicity, I'm only looking for a way that deals with strings constructed of the basic numerals in a standard way (no CCCC
for example):
I, V, X, L, C, D, M
TEST RESULTS
I took the time to extensively test all the code examples that were posted. Two tests were taken, one with a random array of 20 Roman numerals, and a second with an array containing 4000 of those. Same machine, lot of iterations, an average time taken, and all this run several times. Of course this is nothing offical, just my own tests.
TEST WITH 20 NUMERALS:
TEST WITH 4000 NUMERALS:
I have a hard time awarding the bounty. hakre and I made the fastest versions, following the same route, but he made a variation of mine, which was previously based on borrible's idea. So I will accept hakre's solution, because that is the quickest and nicer than mine (IMO). But I will award the bounty to anemgyenge, because I love his version and a lot of effort seems to be put into it.
The ancient Romans had their own method of writing numbers. The symbols: I, V, X, L, C, D, M represented the numbers: 1, 5, 10, 50, 100, 500, 1000. For example, the number 157 is written CLVII in Roman numerals. The numbers are written from left to right in descending order M > D > C > L > V > I.
The Excel ROMAN function converts a number to a Roman numeral as text. For example, the formula =ROMAN(4) returns IV. number - Number (in Arabic numeral) you want to convert to Roman numeral. form - [optional] The type of Roman numeral you want.
Basically, you'd use a separate function to convert from numeral to integer, and use a sort key function that takes any one string in the list, does the splitting and returns the name and converted numeral: def sortkey(s): , name, numeral = s.
Picking your class to convert roman numbers to integers, a user-defined sort callback can handle this to sort the array:
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII'); $bool = usort($a, function($a, $b) { return RomanNumber::Roman2Int($a) - RomanNumber::Roman2Int($b); }); var_dump($a);
So here you find the logic inside the comparison function: if both values are of the same weight, return 0
. If the first is lower than the second, return < 0
(e.g. -1
), otherwise the second is larger than the first so return > 0
(e.g. 1
).
Naturally any other type of function that returns the decimal value for a roman number would work as well.
Edit:
As you commented, you do not want to run the conversion for each pair. That's fine, with a help of an additional array which contains all converted values, you can run the sort on the decimal values and use that sorting on the roman numbers as well (Demo):
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII'); $b = array_map('RomanNumber::Roman2Int', $a); array_multisort($b, $a); var_dump($a);
array_multisort
PHP Manual does most of the magic here.
function sortRomanNum($a, $b) { if($a == $b) return 0; $str = "0IVXLCDM"; $len = 0; if(strlen($a) >= strlen($b)) { $len = strlen($a); $b .= str_repeat("0", $len - strlen($b)); } else { $len = strlen($b); $a .= str_repeat("0", $len - strlen($a)); } for($i = 0; $i < $len - 1; $i++) { $a1 = $a[$i]; $b1 = $b[$i]; $a2 = $a[$i+1]; $b2 = $b[$i+1]; if( strpos($str, $a1.$b1.$a2) !== false ) return 1; if( strpos($str, $b1.$a1.$b2) !== false ) return -1; if($a1 != $b1) return strpos($str, $a1) > strpos($str, $b1) ? 1 : -1; } if($a[$i] != $b[$i]) return strpos($str, $a[$i]) > strpos($str, $b[$i]) ? 1 : -1; }
Given two numbers (roman strings), $a and $b. If there are no substractions in the numbers (IV, IX, XC etc), then the solution would be trivial:
for all $i in $a and $b if $a[$i] > $b[$i] then return 1; //($a is greater then $b) if $a[$i] < $b[$i] then return 1; //($a is lower then $b) return 0 //equality
Since there can be these special parts, the calculation is more complex. But the solution is to find the patterns:
a: IX | XC | CM b: V | L | D
These are the only patterns which can mess up the trivial solution. If you find any of these, then $a will be greater then $b.
Note, that roman numbers don't include zeros, like the arabic ones. Therefore now we will use them (and basically put zeros where they are missing).
So here comes the function:
if $a == $b then return 0; //equality create a string for ordering the roman numerals (strpos will give the right index) define the length of the loop (take the longer string), and add zeros to the end of the shorter number run the loop, and check: 1. if the patterns above are found, return the comparision accordingly (1 or -1) 2. otherwise do the trivial check (compare each numeral) check the last numerals too.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With