Sorting String with non-western characters

I wanted to print sorted Polish names of all available languages.

import java.util.*;

public class Tmp
{
  public static void main(String... args)
  {
    Locale.setDefault(new Locale("pl","PL"));
    Locale[] locales = Locale.getAvailableLocales();
    ArrayList<String> langs = new ArrayList<String>();
    for(Locale loc: locales) {
      String  lng = loc.getDisplayLanguage();
      if(!lng.trim().equals("") && ! langs.contains(lng)){
        langs.add(lng);
      }
    }
    Collections.sort(langs);
    for(String str: langs){
      System.out.println(str);
    }
  }
}

Unfortunately, I have an issue with the sorting part. The output is:

:
:
kataloński
koreański
litewski
macedoński
:
:
węgierski
włoski
łotewski

In Polish, however, ł comes after l and before m, so the output should be:

:
:
kataloński
koreański
litewski
łotewski
macedoński
:
:
węgierski
włoski

How can I accomplish that? Is there a universal, language-independent method (say I later want to display and sort the names in another language with different sorting rules)?

Pawel P. asked Mar 20 '13 08:03



3 Answers

Try:

Collections.sort(langs, Collator.getInstance(new Locale("pl", "PL")));

It will produce:

...
litewski
łotewski
...

See the java.text.Collator API documentation for details.
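For completeness, here is a minimal self-contained sketch of that one-liner in context. The class name PolishSortDemo and the helper sortPolish are made up for illustration, and the sample words are taken from the question's output. Locale.forLanguageTag("pl-PL") is used here as an equivalent of new Locale("pl", "PL").

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; the names are illustrative, not from the question.
class PolishSortDemo {
    // Returns a sorted copy of the input, ordered by the Polish collation rules.
    static List<String> sortPolish(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        // Same locale as new Locale("pl", "PL") in the answer above.
        Collator polish = Collator.getInstance(Locale.forLanguageTag("pl-PL"));
        Collections.sort(sorted, polish);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> langs = Arrays.asList(
                "włoski", "łotewski", "macedoński", "litewski", "węgierski");
        for (String name : sortPolish(langs)) {
            System.out.println(name);
        }
    }
}
```

The only change from the question's code is the second argument to Collections.sort; the rest of the program can stay as it is.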

Evgeniy Dorofeev answered Sep 23 '22 13:09


You should pass a Collator to the sort method:

// sort according to default locale
Collections.sort(langs, Collator.getInstance());

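Without a Collator, Collections.sort uses the natural ordering of String, which compares the numeric values of the characters (UTF-16 code units) one by one. Since ł is U+0142, it lands after every unaccented Latin letter, and that is not the correct alphabetical order in any language.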
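To make the difference concrete, and to cover the "another language" part of the question, here is a rough sketch that sorts the same list both ways. LocaleSortSketch, sortNatural and sortFor are hypothetical names introduced for this example only; the locale is a plain parameter, so any other locale can be passed in its place.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class LocaleSortSketch {
    // Sorted copy using String's natural order (numeric char comparison).
    static List<String> sortNatural(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted);
        return sorted;
    }

    // Sorted copy using the collation rules of whatever locale is passed in.
    static List<String> sortFor(List<String> words, Locale locale) {
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted, Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> langs = Arrays.asList("macedoński", "łotewski", "litewski");

        // Natural order: 'ł' (U+0142) is greater than 'm' (U+006D),
        // so this prints [litewski, macedoński, łotewski].
        System.out.println(sortNatural(langs));

        // Collated order: łotewski should now come between litewski and macedoński.
        System.out.println(sortFor(langs, Locale.forLanguageTag("pl-PL")));
    }
}
```

Passing Locale.getDefault() to sortFor should behave like the Collator.getInstance() call above, since the no-argument factory uses the current default locale.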

Joni answered Sep 20 '22 13:09


Have a look at java.text.Collator.getInstance(Locale). You need to supply the Polish locale in your case. Collator implements the Comparator interface, so you can pass it to the sort APIs and to sorted data structures such as TreeSet.
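As a sketch of the TreeSet variant: the set keeps its elements in collated order and drops duplicates, which would replace both the contains check and the explicit sort in the question's code. SortedLanguageNames, sortedDistinct and displayLanguages are hypothetical names chosen for this example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper class, for illustration only.
class SortedLanguageNames {
    // Distinct words, kept in the collation order of the given locale.
    static SortedSet<String> sortedDistinct(Iterable<String> words, Locale locale) {
        // Collator is a Comparator, so TreeSet accepts it directly.
        SortedSet<String> result = new TreeSet<>(Collator.getInstance(locale));
        for (String word : words) {
            result.add(word);
        }
        return result;
    }

    // Display names of all available languages, written and sorted in one locale.
    static SortedSet<String> displayLanguages(Locale displayLocale) {
        List<String> names = new ArrayList<>();
        for (Locale loc : Locale.getAvailableLocales()) {
            String name = loc.getDisplayLanguage(displayLocale);
            if (!name.trim().isEmpty()) {
                names.add(name);
            }
        }
        return sortedDistinct(names, displayLocale);
    }

    public static void main(String[] args) {
        for (String name : displayLanguages(Locale.forLanguageTag("pl-PL"))) {
            System.out.println(name);
        }
    }
}
```

One thing to keep in mind with this design: a TreeSet decides equality through its comparator, so two strings that the collator considers equal end up as a single entry.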

Dilum Ranatunga answered Sep 21 '22 13:09