Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unexpected result for string comparison

I cannot understand the following behavior:

WITH tests(min, val, max) AS (
    SELECT 'a', 'x', 'z' UNION ALL
    SELECT '',  'x', 'z' UNION ALL
    SELECT 'a', 'x', '~'
)
SELECT min, val, max, CASE WHEN val BETWEEN min AND max THEN 'PASS' ELSE 'FAIL' END AS result
FROM tests

Result:

| min | val | max | result |
|-----|-----|-----|--------|
| a   | x   | z   | PASS   |
|     | x   | z   | PASS   |
| a   | x   | ~   | FAIL   |

The character codes for x and ~ are 120 and 126 respectively. In every programming language I know 'x' < '~' is true. So what is with SQL?

I get same result on different RDBMS. On SQL Server the following collation is used:

Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive

like image 967
Salman A Avatar asked Jul 27 '18 12:07

Salman A


1 Answers

In my comment beneath your question I asked you what collation you were using. You said "assume default", but there is no "default". The "default" depends on how your database and server are setup. I ran your experiment on my SQL Server and happened to get the same results you did, but that was merely a coincidence.

The SQL Server and database I ran the experiment on is using the SQL_Latin1_General_CP1_CI_AS collation. Based on this fact, here is the ordering of characters:

http://collation-charts.org/mssql/mssql.0409.1252.Latin1_General_CI_AS.html

Notice that Tilde ~ comes before alphabetical characters which obviously includes x.

like image 83
rory.ap Avatar answered Sep 19 '22 13:09

rory.ap