I'm using ASCII function for getting equivalent ASCII code of two characters, but I'm surprised when seeing there is no difference between 'ي' and 'ی', can anyone help me?
SELECT ASCII('ي'), ASCII('ی')
However here are the settings we typically recommend: Set the default collation to SQL_Latin1_General_CP1_CI_AS. Ensure that you have SQL Server running in Case Insensitive mode. We use NCHAR, NVARCHAR string types so all data is unicode, so no character set is specified.
The collate clause is used for case sensitive and case insensitive searches in the columns of the SQL server. There are two types of collate clause present: SQL_Latin1_General_CP1_CS_AS for case sensitive. SQL_Latin1_General_CP1_CI_AS for case insensitive.
A collation specifies the bit patterns that represent each character in a dataset. Collations also determine the rules that sort and compare data. SQL Server supports storing objects that have different collations in a single database.
Because your character is non Unicode you have to use UNICODE()
function instead of ASCII()
.
SELECT ASCII('ي'), ASCII('ی')
will result: 237
, 237
but
SELECT UNICODE(N'ي'), UNICODE(N'ی')
will result: 1610
, 1740
Try this
SELECT UNICODE(N'ي'), UNICODE(N'ی')
Another solution by using the proper collate in case you want to use Ascii
Arabic_CS_AS_KS
result will come as ى = 236 and ي= 237
This is a limitation ASCII
function. According to the documentation, ASCII
:
Returns the ASCII code value of the leftmost character of a character expression.
However, the characters in your question are made up of more than one byte. It appears that ASCII
can only read one byte.
When you use these characters as string literals without the N
prefix, they are treated as single-byte characters. The following query shows that SQL Server does not treat these characters as equal in the Arabic_CI_AS collation when they are properly marked as multi-byte:
SELECT CASE WHEN 'ي' COLLATE Arabic_CI_AS <> 'ی' COLLATE Arabic_CI_AS
THEN 1 ELSE 0 END AS are_different_ascii,
CASE WHEN N'ي' COLLATE Arabic_CI_AS <> N'ی' COLLATE Arabic_CI_AS
THEN 1 ELSE 0 END AS are_different_unicode
The following query shows the bytes that make up the characters:
SELECT CAST(N'ي' COLLATE Arabic_CI_AS as varbinary(4)),
CAST(N'ی' COLLATE Arabic_CI_AS as varbinary(4)),
CAST('ي' COLLATE Arabic_CI_AS as varbinary(4)),
CAST('ی' COLLATE Arabic_CI_AS as varbinary(4))
However, even when you mark the characters as unicode, the ASCII
function returns the same value because it can only read one byte:
SELECT ASCII(N'ي' COLLATE Arabic_CI_AS) , ASCII(N'ی' COLLATE Arabic_CI_AS)
EDIT As TT. points out, these characters don't have an entry in the ASCII code table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With