String.prototype.localeCompare handles casing inconsistently?

See the following minimal examples. All examples were tested in Chrome 76.0.3809.100, Firefox 68.0.2 and Node.js 11.15.0. All yielded the same results.

For completeness, all relevant options are given explicitly, although they do not change the outcome: the default sensitivity is 'variant', which behaves practically the same as 'case' for unaccented characters, and 'sort' is the default usage anyway.

No setting was able to resolve the contradiction detailed below. I also tried with several language options, but to no avail.

Example 1.

The following is correct: 'a' comes before 'b'.

const result = 'a'.localeCompare('b', 'en', {
  sensitivity: 'case',
  usage: 'sort',
  caseFirst: 'lower'
});
// -1

Example 2.

The following is correct: with caseFirst: 'lower' set, 'b' comes before 'B'.

const result = 'b'.localeCompare('B', 'en', {
  sensitivity: 'case',
  usage: 'sort',
  caseFirst: 'lower'
});
// -1

Example 3.

The following is also correct. Implementations are not required to support the caseFirst option, but the ones tested here do. With caseFirst: 'upper' set, 'b' comes after 'B'.

const result = 'b'.localeCompare('B', 'en', {
  sensitivity: 'case',
  usage: 'sort',
  caseFirst: 'upper'
});
// 1

Example 4.

The following is also correct. Since 'b' comes before 'B', 'b{anything}' comes before 'B{anything}':

const result = 'ba'.localeCompare('Ba', 'en', {
  sensitivity: 'case',
  usage: 'sort',
  caseFirst: 'lower'
});
// -1

Example 5.

The following result (1) is, I think, incorrect, as it contradicts the statement that 'b{anything}' comes before 'B{anything}':

const result = 'bb'.localeCompare('Ba', 'en', {
  sensitivity: 'case',
  usage: 'sort',
  caseFirst: 'lower'
});
// 1

According to this, 'bb' comes after 'Ba'. That would make sense with either sensitivity: 'base' (case-insensitive comparison) or caseFirst: 'upper'. I tried both, and they correctly produce the same result, 1.

With the settings above, however, I would expect -1: 'bb' should come before 'Ba', because their first letters, 'b' and 'B', should determine the order (and 'b' comes before 'B', as Example 2 shows).

Why does localeCompare behave like this?

Soma Lucz asked Aug 27 '19 09:08


1 Answer

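It does not actually work that way. The comparison does not settle the order on the first pair of characters, case included, before moving on. It first compares the letters at all positions of both strings while ignoring case, and case is only used to break the tie when the strings are otherwise equal.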
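
To make that ordering of decisions concrete, here is a rough sketch that approximates it with two localeCompare calls. The helper name compareByLevels is made up for illustration, and the sketch lumps accent and case differences into a single second step, so it is a simplification rather than what the engine actually runs.

```javascript
// Hypothetical helper: approximates level-by-level comparison for illustration only.
function compareByLevels(a, b) {
  // Step 1: base letters only (case and accents ignored), over the whole string.
  const base = a.localeCompare(b, 'en', { sensitivity: 'base' });
  if (base !== 0) return Math.sign(base);
  // Step 2: reached only when the base letters are identical at every position.
  return Math.sign(a.localeCompare(b, 'en', { sensitivity: 'variant', caseFirst: 'lower' }));
}

console.log(compareByLevels('ba', 'Ba')); // -1: base letters tie, so case decides
console.log(compareByLevels('bb', 'Ba')); // 1: the second base letter decides, 'b' > 'a'
```

This reproduces the results of Example 4 and Example 5 from the question: in 'bb' versus 'Ba' the case of the first letter is never consulted, because the base letters already differ at the second position.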

As a workaround, you could compare each character with the character at the same position in the other string (strings of different lengths may need extra handling, for example comparing only up to the length of the shorter one).

var array = ['a', 'b', 'bb', 'Bb', 'ba', 'BA', 'B', 'bA'];

array.sort();
console.log(...array);

array.sort((a, b) => a.localeCompare(b, 'kf', { sensitivity: 'case', caseFirst: 'lower' }));
console.log(...array);

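array.sort((a, b) => {
    // Compare position by position; the first non-zero result decides the order.
    var r = 0;
    Array.from(a).some((c, i) => r = i < b.length
        ? c.localeCompare(b[i], 'kf', { sensitivity: 'case', caseFirst: 'lower' })
        : 1); // b ran out of characters first, so a sorts after b
    // If a is a prefix of b (or both are equal), the shorter string sorts first.
    return r || a.length - b.length;
});
console.log(...array);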
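
The same workaround can also be written as a standalone comparator. This is only a sketch under a few assumptions: the name compareCharwise is hypothetical, the 'en' locale and the collator options are taken from the question rather than from the code above, and strings are split by code point with Array.from.

```javascript
// One reusable collator; 'en' and these options are assumptions taken from the question.
const collator = new Intl.Collator('en', { sensitivity: 'variant', caseFirst: 'lower' });

// Hypothetical helper: the first differing character decides, case included.
function compareCharwise(a, b) {
  const xs = Array.from(a); // split by code point, not by UTF-16 code unit
  const ys = Array.from(b);
  const n = Math.min(xs.length, ys.length);
  for (let i = 0; i < n; i++) {
    const r = collator.compare(xs[i], ys[i]);
    if (r !== 0) return r;
  }
  // All shared positions are equal: the shorter string sorts first.
  return xs.length - ys.length;
}

const words = ['a', 'b', 'bb', 'Bb', 'ba', 'BA', 'B', 'bA'];
console.log(...words.slice().sort(compareCharwise));
// a b ba bA bb B BA Bb
```

With this comparator 'bb' sorts before 'Ba', which is the order the question expected. Note that the result is no longer a linguistically correct collation order, since sequences that a locale treats as a unit are compared piecewise.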
Nina Scholz answered Nov 15 '22 07:11