Java CollationKey sorting wrong

Tags: java, compare, locale

I have a problem comparing strings. I want to compare two French strings, "éd" and "ef", like this:

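import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

Collator localeSpecificCollator = Collator.getInstance(Locale.FRANCE);
CollationKey a = localeSpecificCollator.getCollationKey("éd");
CollationKey b = localeSpecificCollator.getCollationKey("ef");
System.out.println(a.compareTo(b)); // negative result: "éd" sorts before "ef"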

This prints -1, but in the French alphabet e comes before é. However, when we compare only e and é like this:

Collator localeSpecificCollator = Collator.getInstance(Locale.FRANCE);
CollationKey a = localeSpecificCollator.getCollationKey("é");
CollationKey b = localeSpecificCollator.getCollationKey("e");
System.out.println(a.compareTo(b));

the result is 1. Can you tell me what is wrong in the first part of the code?

asked Aug 10 '12 10:08

Ashot

1 Answer

This seems to be the expected behaviour and it also seems to be the correct way to sort alphabetically in French.

The Android javadoc gives a hint as to why it behaves this way (I assume the implementation details on Android are similar, if not identical, to the standard JDK):

A tertiary difference is ignored when there is a primary or secondary difference anywhere in the strings.

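In other words, because your two strings can be ordered by looking only at primary differences (ignoring the accents), the collator does not consider the other differences. Here the base letters d and f already differ, so "éd" sorts before "ef" regardless of the accent.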
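To see the levels in isolation, here is a minimal sketch that compares the same strings at different collator strengths. The class name CollationStrengthDemo and the helper compareAt are made up for illustration; Collator.setStrength and the PRIMARY, SECONDARY and TERTIARY constants are standard java.text API. I would expect é and e to compare as equal at PRIMARY strength and as different from SECONDARY upwards, while "éd" versus "ef" should come out negative either way.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationStrengthDemo {

    // Hypothetical helper: compares two strings with a French collator at the
    // given strength and returns only the sign of the result (-1, 0 or 1).
    static int compareAt(int strength, String left, String right) {
        Collator collator = Collator.getInstance(Locale.FRANCE);
        collator.setStrength(strength);
        return Integer.signum(collator.compare(left, right));
    }

    public static void main(String[] args) {
        // é written as a Unicode escape to avoid source-encoding issues.
        String eAcute = "\u00e9";

        // Base letters only: é and e count as the same letter.
        System.out.println(compareAt(Collator.PRIMARY, eAcute, "e"));
        // Accents are taken into account, so é and e are no longer equal.
        System.out.println(compareAt(Collator.SECONDARY, eAcute, "e"));
        // "éd" vs "ef": decided by the base letters d and f.
        System.out.println(compareAt(Collator.TERTIARY, eAcute + "d", "ef"));
    }
}
```

The helper returns only the sign because the Comparator contract guarantees the sign of the result, not its exact value.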

This seems to be consistent with the Unicode Collation Algorithm (UCA):

Accent differences are typically ignored, if the base letters differ.

And it also seems to be the correct way to sort alphabetically in French, according to the French Wikipedia article "Ordre alphabétique" (translated):

At first, accented characters, like capital letters, have the same alphabetical rank as the base character.
If several words have the same alphabetical rank, one tries to distinguish them using capitals and accents (for e, the order is e, é, è, ê, ë).

In other words: the ordering initially ignores accents and case. Only if two words cannot be ordered that way are accents and case taken into account.
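As a quick check of that rule, the sketch below sorts the four strings from the question with the French collator. The class FrenchSortDemo and the method sortFrench are hypothetical names; the only library behaviour relied on is that Collator implements Comparator and can be passed to List.sort. Under the rule described above, I would expect the order e, é, éd, ef.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FrenchSortDemo {

    // Hypothetical helper: returns a sorted copy of the input using the French collator.
    static List<String> sortFrench(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        // Collator implements Comparator<Object>, so it can be passed to sort directly.
        sorted.sort(Collator.getInstance(Locale.FRANCE));
        return sorted;
    }

    public static void main(String[] args) {
        // \u00e9 is é, written as an escape to avoid source-encoding issues.
        List<String> words = Arrays.asList("ef", "\u00e9d", "\u00e9", "e");
        System.out.println(sortFrench(words));
    }
}
```

The accent only decides the order between e and é, where the base letters are identical. For "éd" and "ef" the base letters d and f settle it first.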

answered Oct 02 '22 08:10

assylias
