
PATINDEX with letter range exclude diacritics (accented characters)

I am trying to figure out how to use PATINDEX to find a range of letter characters while excluding accented characters. A straight equality comparison behaves as expected: under an accent-sensitive collation, the accented character does not match. However, when I search for a range of letters, the range matches the accented character, even with the accent-sensitive collation:

SELECT
    IIF('Ú' = 'U' COLLATE Latin1_General_CI_AI, 'Match', 'No') AS MatchInsensitive,
    IIF('Ú' = 'U' COLLATE Latin1_General_CI_AS, 'Match', 'No') AS MatchSensitive,
    PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AI)      AS PIInsensitive,
    PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AS)      AS PISensitive

This gives the following results:

MatchInsensitive MatchSensitive PIInsensitive PISensitive
---------------- -------------- ------------- -----------
Match            No             1             1

What I am really trying to do is to identify the character position of accented characters in a string, so I was really searching for PATINDEX('%[^A-Z0-9 ]%').

I would expect the following query to return 2, but I get 0:

SELECT PATINDEX('%[^A-Z0-9 ]%', 'médico')

RPh_Coder asked Mar 30 '17 20:03



1 Answer

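You could use a binary collation, e.g. Latin1_General_100_BIN2. A binary collation compares characters by code point rather than by linguistic sort order, so the range a-z covers only the unaccented letters. It is also case-sensitive, which is why the pattern lists both a-z and A-Z.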

select patindex('%[^a-zA-Z0-9 ]%', 'médico' collate Latin1_General_100_BIN2)

rextester: http://rextester.com/ZICLN98474

returns 2
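
Since the question asks for the position of accented characters in general, the same idea can be extended to report every offending position rather than only the first. The following is a minimal sketch, not a definitive implementation: the variable names (@s, @pos, @hit), the column aliases, and the sample string are made up for illustration, and it assumes the set of allowed characters is a-z, A-Z, 0-9 and space, as in the pattern above.

```sql
-- Under a binary collation, the range A-Z no longer matches 'Ú'.
SELECT PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_100_BIN2) AS PIBinary; -- 0

-- Hypothetical sample input and loop state.
DECLARE @s   nvarchar(100) = N'médico túnel';
DECLARE @pos int = 0;   -- position of the last hit found so far
DECLARE @hit int;       -- offset of the next hit within the remaining text

WHILE 1 = 1
BEGIN
    -- Search only the part of the string after the last hit.
    SET @hit = PATINDEX('%[^a-zA-Z0-9 ]%',
                        SUBSTRING(@s, @pos + 1, LEN(@s))
                            COLLATE Latin1_General_100_BIN2);

    IF @hit = 0 BREAK;  -- no further characters outside the allowed set

    -- Convert the relative offset into a position in the full string.
    SET @pos = @pos + @hit;
    SELECT @pos AS Position, SUBSTRING(@s, @pos, 1) AS FoundChar;
END
```

For the sample string this reports positions 2 (é) and 9 (ú), one result set per hit. Any other character outside the allowed set, such as punctuation, is reported as well, so the character class needs to be widened if those should be ignored.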

SqlZim answered Nov 14 '22 15:11
