I have two Postgres 9.1 instances: one local, installed via Postgres.app on OS X, and one remote, on Heroku. I've ensured that lc_collate
is en_US.UTF-8
on both machines but am still seeing different behavior between the two.
On my local instance, SELECT 'i' > 'N'
returns t
whereas remotely it returns f
. Given that I've already checked lc_*
on both systems, what explains the difference I'm seeing?
From the point of view of Unicode, the case ordering is a customization. Excerpt from http://www.unicode.org/reports/tr10:
Case Ordering. Some dictionaries and authors collate uppercase before lowercase while others use the reverse, so that preference needs to be customizable. Sometimes the case ordering is mandated by the government, as in Denmark. Often it is simply a customization or user preference.
Mac OS X simply has a different case ordering than the OS used by Heroku. On Mac OS X:
$ LC_CTYPE=en_US.UTF-8 sort << EOF
> i
> N
> EOF
produces:
N
i
The exact same command and same data on Ubuntu 12.04 produces:
i
N
This has none to do with PostgreSQL, except for the fact that it uses the OS for collation, so these unfortunate discrepancies between different OS impact databases.
Starting with version 10, PostgreSQL may use collations provided by the ICU library, for servers compiled with ICU. These collations can sort consistently across operating systems.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With