I'm writing an integration test that is passing a complex order by
to PostgreSQL and then checking that the data comes back in the correct order. I'm writing this integration test in Java and its String.compareTo
method appears to sort things differently from PostgreSQLs. I ran this on my PostgreSQL database:
SELECT regexp_split_to_table('D d a A c b', ' ') ORDER BY 1;
It responded with this:
a
A
b
c
d
D
I then created this unit test to compare that to the way Java sorts things:
import com.google.common.collect.Lists;
import com.google.common.collect.Ordering;
import org.junit.Test;
import java.util.List;
import static junit.framework.Assert.assertEquals;
public class PostgresqlSortOrderTest {
@Test
public void whenJavaSortsStringsThenItIsTheSameAsWhenPostgresqlSortsStrings() {
List<String> postgresqlOrder = Lists.newArrayList("a", "A", "b", "c", "d", "D");
Ordering<String> ordering = new Ordering<String>() {
@Override
public int compare(String left, String right) {
return left.compareTo(right);
}
};
List<String> javaOrdering = ordering.sortedCopy(postgresqlOrder);
assertEquals(postgresqlOrder, javaOrdering);
}
}
This failed with this output:
Expected :[a, A, b, c, d, D] //postgresql
Actual :[A, D, a, b, c, d] //java
I'm very ignorant of the terminology here. I'd like to know the names of these different String sorts so I can communicate better. But more importantly, how can I make Java sort like PostgreSQL does?
Late to show with an answer, but I'm afraid a simple case insensitive search isn't necessarily going to do what you want.
The keyword you want in your searches is collation (and in a wider sense locales) and PostgreSQL relies on the underlying operating-system to provide support for this. The ordering is rarely a simple character-by-character comparison. For example, in many locales spaces are ignored (that's certainly the case in en_GB).
Also, this means you can end up with different sort orders on different platforms (depending on whether Apple or Microsoft agree with Linus as to the default ordering for your country).
There has been some discussion as to whether it would make sense to include a BSD-licenced library to provide a consistent set of orderings across platforms. However, this is a lot of work and then means you can end up with different sorting inside your database from the rest of your operating-system. While different providers disagree on how to handle this, there's no one simple solution I'm afraid.
You might want to investigate the "C" collation for "traditional" sorting. I'm afraid I can't comment on Java's handling of proper locale sorting - not my field.
Use comparator:String.CASE_INSENSITIVE_ORDER
to sort any Collection
of Strings
. It is already implented in String
class.
Look at the Field Summary at: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html
Here is a comparator that matches the PostgreSQL ordering using the en_GB locale:
Comparator<String> comparator = (left, right) -> {
Collator collator = Collator.getInstance(Locale.UK);
collator.setStrength(Collator.PRIMARY);
return collator.compare(left.replaceAll("\\p{Punct}", ""), right.replaceAll("\\p{Punct}", ""));
};
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With