Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server: COLLATE CS->CI performance vs UPPER

What's better to use when comparing columns that could contains values with different cases? Is Collate faster? Just trying to figure this out because in our project with tons of data comparisons. we use:

select * from table t1, table t2 where 
t1.col1 collate SQL_Latin1_General_CP1_CI_AS = t2.colb collate SQL_Latin1_General_CP1_CI_AS 

why we can't rewrite as:

select * from table t1, table t2 where 
UPPER(t1.col1)  = UPPER(t2.colb)
like image 701
zmische Avatar asked Sep 01 '10 09:09

zmische


People also ask

What are different types of collation sensitivity in SQL Server?

The options associated with a collation are case sensitivity, accent sensitivity, kana sensitivity, width sensitivity, and variation-selector sensitivity.

What is the difference between SQL_Latin1_General_CP1_CI_AS and Latin1_General_CI_AS?

The SQL_Latin1_General_CP1_CI_AS collation is a SQL collation and the rules around sorting data for unicode and non-unicode data are different. The Latin1_General_CI_AS collation is a Windows collation and the rules around sorting unicode and non-unicode data are the same.

What is collate SQL_Latin1_General_CP1_CI_AS in SQL Server?

The collate clause is used for case sensitive and case insensitive searches in the columns of the SQL server. There are two types of collate clause present: SQL_Latin1_General_CP1_CS_AS for case sensitive. SQL_Latin1_General_CP1_CI_AS for case insensitive.

What is collate Latin1_General_CS_AS in SQL Server?

According the SQL Server Books Online, the characters included in range searches depend on the sorting rules of the collation. Collation Latin1_General_CS_AS uses dictionary order so both upper and lower case characters of the specified range are included.


1 Answers

Have you tried them?

I'd suggest they'd be equally bad because an index can't be used because of the function/COLLATE. The overhead of one of COLLATE or UPPER will be tiny compared to scanning a large table.

I would use UPPER personally because it's more obvious. The use of UPPER would also tell me the columns are case sensitive anyway.

Edit, Oct 2012

UPPER has the advantage of being collation independent.

Using COLLATE requires to force a code page, basically. UPPER won't. So you can keep your intended sort/compare logic rather than forcing standard Latin.

like image 161
gbn Avatar answered Oct 05 '22 02:10

gbn