Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mysterious Postgres string comparison result when string contains certain symbol characters

My Postgres server gives me very confusing result while I try to compare strings which contains slash '/' or question mark '?'. For example in psql I ran:

select ('/' < '1') as c1,
       ('/1' < '1') as c2,
       ('/////1' < '1') as c3,
       ('/1' < '2') as c4,
       ('/1' < '11') as c5;

The result was:

 c1 | c2 | c3 | c4 | c5
----+----+----+----+----
 t  | f  | f  | t  | t

So '/' is smaller than '1', but '/1' is larger than '1'. In fact '/1' is between '1' and '2' and so is '/////1'. This did not follow lexicographical order.

However, '/1' was (correctly) smaller than 11 which made me even more confused.

I wanted to see if '/1' is considered escaped. So I ran:

select length('/1');

and I got 2, which meant that postgres does treat '/1' as a two character string.

The same problem happens when I replace / with other symbols such as $ or ?.

If you have docker, this problem is easily reproducible by runing a postgres in a docker container:

docker run postgres:11
docker exec  -it `docker ps | grep postgres:11 | cut -d' ' -f 1` psql -U postgres

Then try the above SQLs. I tried postgres 10 image and the behavior is the same.

The same thing happens to a real SQL when I compare a VARCHAR column with a string literal. This problem is driving me crazy since I need to write the correct SQL to compare file paths, which apparently contain many '/' symbols.

I searched and did not find any document talking about this so this doesn't look like a postgres 'official feature'. What is the right way to write a comparison that follows the lexicographical order?

Thanks a lot in advance.

like image 479
Zhan Su Avatar asked Dec 08 '25 09:12

Zhan Su


1 Answers

Postgres uses the operating system's collation (on Linux that would be the ones provided by glibc). So your results depend on the underlying operating system.

  • Rextester seems to run on Windows, and there the comparison works as expected
  • db<>fiddle works on Debian showing the same behaviour as you see
  • db-fiddle.com works on RedHat and there the comparison works as expected as well.

You can force an ASCCI comparison by using the "C" collation (as I did in the above examples):

select '/1' > '1' collate "C"

which seems to work the same on all platforms. Alternatively you can specify an ICU collation that will also work the same across all platforms.


You mentioned you want to compare file paths. One way to do that only on the "names" (ignoring the delimiter") is to convert the path to an array string_to_array(filepath, '/') and then e.g. use that array for sorting or comparison.


Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!