
Does an nvarchar always store each character in two bytes?

I had (perhaps naively) assumed that in SQL Server, an nvarchar would store each character in two bytes. But this does not always seem to be the case. The documentation out there suggests that some characters might take more bytes. Does someone have a definitive answer?

Rich asked Jan 17 '11 13:01



2 Answers

Yes, it uses 2 bytes per character. Use DATALENGTH to get the storage size; you can't use LEN, because LEN just counts the characters. See: The differences between LEN and DATALENGTH in SQL Server

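DECLARE @n NVARCHAR(10)
DECLARE @v VARCHAR(10)

-- The N prefix marks the literal as Unicode (nvarchar)
SELECT @n = N'A', @v = 'A'

-- DATALENGTH returns the number of bytes, not the number of characters
SELECT DATALENGTH(@n), DATALENGTH(@v)

---------
2 1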

Here is what Books Online has: http://msdn.microsoft.com/en-us/library/ms186939.aspx

Character data types that are either fixed-length, nchar, or variable-length, nvarchar, Unicode data and use the UNICODE UCS-2 character set.

nchar [ ( n ) ]

Fixed-length Unicode character data of n characters. n must be a value from 1 through 4,000. The storage size is two times n bytes. The ISO synonyms for nchar are national char and national character.

nvarchar [ ( n | max ) ]

Variable-length Unicode character data. n can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size, in bytes, is two times the number of characters entered + 2 bytes. The data entered can be 0 characters in length. The ISO synonyms for nvarchar are national char varying and national character varying.
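As a rough illustration of the arithmetic in the quoted documentation, here is a minimal Python sketch of the two storage formulas. The function names are hypothetical and only restate the quoted text ("two times n bytes" for nchar, "two times the number of characters entered + 2 bytes" for nvarchar); they do not query SQL Server and ignore compression.

```python
def nchar_storage_bytes(n: int) -> int:
    """Storage for nchar(n) per the quoted docs: two times n bytes."""
    if not 1 <= n <= 4000:
        raise ValueError("n must be a value from 1 through 4,000")
    return 2 * n


def nvarchar_storage_bytes(chars_entered: int) -> int:
    """Storage for nvarchar per the quoted docs: 2 * characters entered + 2 bytes."""
    if chars_entered < 0:
        raise ValueError("character count cannot be negative")
    return 2 * chars_entered + 2


if __name__ == "__main__":
    # nchar(10) is fixed-length, so it always takes the full 20 bytes.
    print("nchar(10):", nchar_storage_bytes(10))
    # An nvarchar holding one character: 2 bytes of data plus the 2 extra bytes.
    print("nvarchar, 1 char entered:", nvarchar_storage_bytes(1))
```

Note that DATALENGTH in the example above reports only the 2 bytes of character data, not the additional 2 bytes in the formula.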

That said, Unicode compression was introduced in SQL Server 2008 R2, so it might store ASCII as 1 byte. You can read about Unicode compression here:

  • SQL Server 2008 R2 : A quick experiment in Unicode Compression
  • SQL Server 2008 R2 : Digging deeper into Unicode compression
  • More testing of Unicode Compression in SQL Server 2008 R2
SQLMenace answered Sep 22 '22 10:09


Given that there are more than 65536 characters, it should be obvious that a character cannot possibly fit in just two octets (i.e. 16 bits).

SQL Server, like most of Microsoft's products (Windows, .NET, NTFS, …) uses UTF-16 to store text, in which a character takes up either two or four octets, although as @SQLMenace points out, current versions of SQL Server use compression to reduce that.
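To see the "two or four octets" point without a SQL Server instance, here is a small Python sketch that encodes a few sample characters as UTF-16 and counts the bytes. The helper name `utf16_byte_length` and the sample characters are my own choices for illustration; this shows the encoding itself, not SQL Server's on-disk format or any compression.

```python
def utf16_byte_length(text: str) -> int:
    """Return how many bytes `text` occupies as UTF-16 (little-endian, no BOM)."""
    return len(text.encode("utf-16-le"))


if __name__ == "__main__":
    samples = {
        "Latin A (U+0041)": "\u0041",
        "e with acute (U+00E9)": "\u00e9",
        "CJK ideograph (U+4E2D)": "\u4e2d",
        # Outside the Basic Multilingual Plane, so it needs a surrogate pair.
        "emoji (U+1F600)": "\U0001F600",
    }
    for label, ch in samples.items():
        print(f"{label}: {utf16_byte_length(ch)} bytes")
```

The first three characters come out at 2 bytes each, and the last one at 4 bytes, because code points above U+FFFF are stored as two 16-bit code units.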

Jörg W Mittag answered Sep 20 '22 10:09