Hi, I have this piece of code:
Collator col = Collator.getInstance(Locale.GERMAN);
List<String> list = new ArrayList<String>();
list.add("ac");
list.add("äb");
list.add("aa");
list.add("bb");
Collections.sort(list,col);
System.out.println(list);
I expected the output [aa, ac, äb, bb], but I am getting [aa, äb, ac, bb] instead.
I have no idea what I am doing wrong. Thanks in advance for any help.
Hi, thanks to all for the answers.
Unfortunately, the project requirements clearly state that the strings must be sorted in this order: [aa, ac, äb, bb]. So I tried the following code:
String europeanRules =
("< a,A ; \u00e0,\u00c0 ; \u00e1,\u00c1 ; \u00e2,\u00c2 ; \u00e3,\u00c3; \u00e4,\u00c4 ; \u00e5,\u00c5 ; \u00e6,\u00c6 "+
"; \u0101,\u0100 ; \u0103,\u0102 ; \u0105,\u0104 " +
"< b,B < c,C ; \u00e7,\u00c7 ; \u0107,\u0106 ; \u0109,\u0108 ; \u010b,\u010a ; \u010d,\u010c " +
"< d,D ; \u010f,\u010e ; \u0111,\u0110 " +
"< e,E ; \u00e8,\u00c8 ; \u00e9,\u00c9 ; \u00ea,\u00ca ; \u00eb,\u00cb " +
"; \u0113,\u0112 ; \u0115,\u0114 ; \u0116,\u0117 ; \u0119,\u0118 ; \u011b,\u011a " +
"< f,F < g,G < h,H " +
"< i,I ; \u00ec,\u00cc ; \u00ed,\u00cd ; \u00ee,\u00ce ; \u00ef,\u00cf " +
"< j,J < k,K " +
"< l,L ; \u013a,\u0139 ; \u013c,\u013b ; \u013e,\u013d ; \u0140,\u013f ; \u0142,\u0141 " +
"< m,M < n,N ; \u00f1,\u00d1 ; \u0144,\u0143 ; \u0146,\u0145 ; \u0148,\u0147 " +
"< o,O ; \u00f2,\u00d2 ; \u00f3,\u00d3 ; \u00f4,\u00d4 ; \u00f5,\u00d5 ; \u00f6,\u00d6 ; \u00f8,\u00d8 " +
"; \u014d,\u014c ; \u014f,\u014e ; \u0151,\u0150 " +
"< p,P < q,Q < r,R ; \u0155,\u0154 ; \u0157,\u0156 ; \u0159,\u0158 " +
"< s,S ; \u015b,\u015a ; \u015d,\u015c ; \u015f,\u015e ; \u0161,\u0160 " +
"< t,T ; \u0163,\u0162 ; \u0165,\u0164 ; \u0167,\u0166 " +
"< u,U ; \u00f9,\u00d9 ; \u00fa,\u00da ; \u00fb,\u00db ; \u00fc,\u00dc ; \u0169,\u0168 ; \u016b,\u016a ; \u016d,\u016c " +
"; \u016f,\u016e ; \u0171,\u0170 ; \u0173,\u0172 " +
"< v,V < w,W ; \u0175,\u0174 " +
"< x,X < y,Y ; \u00fd,\u00dd ; \u00ff ; \u0177,\u0176 ; \u0178 " +
"< z,Z ; \u017a,\u0179 ; \u017c,\u017b ; \u017e,\u017d");
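RuleBasedCollator col;
try {
    col = new RuleBasedCollator(europeanRules);
} catch (ParseException e) {
    // Fail fast: swallowing the exception would leave col unset and hide a bad rule string.
    throw new IllegalStateException("Invalid collation rules", e);
}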
col.setStrength(Collator.SECONDARY);
col.setDecomposition(Collator.FULL_DECOMPOSITION);
List<String> list = new ArrayList<String>();
list.add("ac");
list.add("äb");
list.add("aa");
list.add("bb");
Collections.sort(list,col);
System.out.println(list);
U+00E4 is the Unicode code point for ä, so as I understand it this should work. Or am I doing something wrong? Thanks in advance for any help.
The order you get is correct, at least according to the Wikipedia entry on this subject (sorry, it is in German; Google Translate might help you, although it corrupts the umlauts for me).
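To illustrate why the collator puts "äb" before "ac", here is a minimal sketch. It assumes a cut-down, hypothetical rule set (only a, ä, b, c) in which ä differs from a at the secondary level only, as in the rules from the question. The class name `CollationLevels` and the helper `collator` are made up for the example.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

public class CollationLevels {
    // Hypothetical, cut-down rules: ';' makes a-umlaut (U+00E4) a secondary variant of a.
    static final String RULES = "< a,A ; \u00e4,\u00c4 < b,B < c,C";

    static RuleBasedCollator collator(int strength) {
        try {
            RuleBasedCollator col = new RuleBasedCollator(RULES);
            col.setStrength(strength);
            return col;
        } catch (ParseException e) {
            throw new IllegalStateException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator secondary = collator(Collator.SECONDARY);
        // Base letters are compared first: a equals a-umlaut, then b < c decides.
        System.out.println(secondary.compare("\u00e4b", "ac") < 0);
        // Only the accent differs, so the secondary level breaks the tie.
        System.out.println(secondary.compare("a", "\u00e4") < 0);
        // At PRIMARY strength the accent is ignored entirely.
        System.out.println(collator(Collator.PRIMARY).compare("a", "\u00e4") == 0);
    }
}
```

All three lines should print `true`: the accent only matters when the base letters of the whole string are otherwise equal.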
If you want your accented characters to always come after the normal ones, you can prepend an @ to them in the rule you define for the RuleBasedCollator.
The RuleBasedCollator documentation defines the rule elements as follows:
[...]
Modifier: There are currently two modifiers that turn on special collation rules.
'@' : Turns on backwards sorting of accents (secondary differences), as in French.
'!' : Turns on Thai/Lao vowel-consonant swapping. If this rule is in force when a Thai vowel of the range \U0E40-\U0E44 precedes a Thai consonant of the range \U0E01-\U0E2E OR a Lao vowel of the range \U0EC0-\U0EC4 precedes a Lao consonant of the range \U0E81-\U0EAE then the vowel is placed after the consonant for collation purposes.
[...]
So your sample code would look as follows (I made the change only for the ä character, i.e. @\u00e4,@\u00c4):
String europeanRules =
("< a,A ; \u00e0,\u00c0 ; \u00e1,\u00c1 ; \u00e2,\u00c2 ; \u00e3,\u00c3; @\u00e4,@\u00c4 ; \u00e5,\u00c5 ; \u00e6,\u00c6 "+
"; \u0101,\u0100 ; \u0103,\u0102 ; \u0105,\u0104 " +
"< b,B < c,C ; \u00e7,\u00c7 ; \u0107,\u0106 ; \u0109,\u0108 ; \u010b,\u010a ; \u010d,\u010c " +
"< d,D ; \u010f,\u010e ; \u0111,\u0110 " +
"< e,E ; \u00e8,\u00c8 ; \u00e9,\u00c9 ; \u00ea,\u00ca ; \u00eb,\u00cb " +
"; \u0113,\u0112 ; \u0115,\u0114 ; \u0116,\u0117 ; \u0119,\u0118 ; \u011b,\u011a " +
"< f,F < g,G < h,H " +
"< i,I ; \u00ec,\u00cc ; \u00ed,\u00cd ; \u00ee,\u00ce ; \u00ef,\u00cf " +
"< j,J < k,K " +
"< l,L ; \u013a,\u0139 ; \u013c,\u013b ; \u013e,\u013d ; \u0140,\u013f ; \u0142,\u0141 " +
"< m,M < n,N ; \u00f1,\u00d1 ; \u0144,\u0143 ; \u0146,\u0145 ; \u0148,\u0147 " +
"< o,O ; \u00f2,\u00d2 ; \u00f3,\u00d3 ; \u00f4,\u00d4 ; \u00f5,\u00d5 ; \u00f6,\u00d6 ; \u00f8,\u00d8 " +
"; \u014d,\u014c ; \u014f,\u014e ; \u0151,\u0150 " +
"< p,P < q,Q < r,R ; \u0155,\u0154 ; \u0157,\u0156 ; \u0159,\u0158 " +
"< s,S ; \u015b,\u015a ; \u015d,\u015c ; \u015f,\u015e ; \u0161,\u0160 " +
"< t,T ; \u0163,\u0162 ; \u0165,\u0164 ; \u0167,\u0166 " +
"< u,U ; \u00f9,\u00d9 ; \u00fa,\u00da ; \u00fb,\u00db ; \u00fc,\u00dc ; \u0169,\u0168 ; \u016b,\u016a ; \u016d,\u016c " +
"; \u016f,\u016e ; \u0171,\u0170 ; \u0173,\u0172 " +
"< v,V < w,W ; \u0175,\u0174 " +
"< x,X < y,Y ; \u00fd,\u00dd ; \u00ff ; \u0177,\u0176 ; \u0178 " +
"< z,Z ; \u017a,\u0179 ; \u017c,\u017b ; \u017e,\u017d");
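RuleBasedCollator col;
try {
    col = new RuleBasedCollator(europeanRules);
} catch (ParseException e) {
    // Fail fast: swallowing the exception would leave col unset and hide a bad rule string.
    throw new IllegalStateException("Invalid collation rules", e);
}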
col.setStrength(Collator.SECONDARY);
col.setDecomposition(Collator.FULL_DECOMPOSITION);
List<String> list = new ArrayList<String>();
list.add("ac");
list.add("äb");
list.add("aa");
list.add("af");
list.add("bb");
Collections.sort(list,col);
System.out.println(list);
The output is:
[aa, ac, af, äb, bb]
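As an alternative sketch (not the approach used above), the same order can be obtained by declaring ä as a letter of its own, using `<` (primary difference) instead of `;` (secondary difference). The rule string below is a cut-down, hypothetical one that covers only the letters in the sample list, and the class name `UmlautAfterBaseLetter` and helper `sorted` are made up for the example.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UmlautAfterBaseLetter {
    // Hypothetical, cut-down rules: '<' before a-umlaut (U+00E4) makes it
    // a separate letter that sorts between a and b.
    static final String RULES = "< a,A < \u00e4,\u00c4 < b,B < c,C < f,F";

    static List<String> sorted(List<String> input) {
        try {
            RuleBasedCollator col = new RuleBasedCollator(RULES);
            List<String> copy = new ArrayList<>(input); // leave the caller's list untouched
            Collections.sort(copy, col);
            return copy;
        } catch (ParseException e) {
            throw new IllegalStateException("Invalid collation rules", e);
        }
    }

    public static void main(String[] args) {
        // Every word starting with a sorts before every word starting with a-umlaut.
        System.out.println(sorted(Arrays.asList("ac", "\u00e4b", "aa", "af", "bb")));
    }
}
```

With the full europeanRules string, the equivalent change would presumably be to move \u00e4,\u00c4 out of the a group into its own `<` group just before `< b,B`; characters missing from a rule string sort after all listed ones, so the cut-down rules here are only suitable for the sample data.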