I had (perhaps naively) assumed that in SQL Server, an nvarchar would store each character in two bytes. But this does not always seem to be the case. The documentation out there suggests that some characters might take more bytes. Does someone have a definitive answer?
The minimum size of the NVARCHAR value is 1 byte. The total length of an NVARCHAR variable cannot exceed 65,534 bytes. A variable declared as NVARCHAR without parameters has a maximum size of 1 byte.
A byte is a series of 8 bits, each of which is either on or off (1 or 0). In a single-byte encoding such as ASCII, one byte holds one character, so two bytes hold two characters.
nvarchar [ ( n | max ) ] The storage size is two times n bytes + 2 bytes. For UCS-2 encoding, the storage size is two times n bytes + 2 bytes and the number of characters that can be stored is also n.
NVARCHAR uses 2 bytes for each character while VARCHAR uses just 1. Because of this, special characters that the VARCHAR code page cannot represent are lost when you convert NVARCHAR to VARCHAR, and converting back to NVARCHAR does not restore them.
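The lossy round trip can be illustrated outside SQL Server with a short Python sketch. It assumes Windows-1252 as a stand-in for a single-byte VARCHAR code page, the helper name `round_trip` is hypothetical, and Python's `errors="replace"` only approximates how SQL Server substitutes characters it cannot represent.

```python
def round_trip(text: str, code_page: str = "cp1252") -> str:
    """Encode text into a single-byte code page, then decode it back.

    Characters the code page cannot represent are replaced with '?',
    so the original text cannot be recovered afterwards.
    """
    narrow = text.encode(code_page, errors="replace")  # the lossy step
    return narrow.decode(code_page)


original = "日本語 abc"
restored = round_trip(original)
print(restored)              # ??? abc
print(restored == original)  # False: the original characters are gone
```

Text that fits entirely in the code page (plain ASCII, for instance) survives the round trip unchanged.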
Yes, it uses 2 bytes per character. Use DATALENGTH to get the storage size; you can't use LEN, because LEN just counts the characters. See: The differences between LEN and DATALENGTH in SQL Server
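DECLARE @n NVARCHAR(10);
DECLARE @v VARCHAR(10);
-- The N prefix marks a Unicode (nvarchar) literal
SELECT @n = N'A', @v = 'A';
-- DATALENGTH returns the number of bytes, not characters
SELECT DATALENGTH(@n), DATALENGTH(@v);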
---------
2 1
Here is what Books Online has: http://msdn.microsoft.com/en-us/library/ms186939.aspx
Character data types that are either fixed-length, nchar, or variable-length, nvarchar, Unicode data and use the UNICODE UCS-2 character set.
nchar [ ( n ) ]
Fixed-length Unicode character data of n characters. n must be a value from 1 through 4,000. The storage size is two times n bytes. The ISO synonyms for nchar are national char and national character.
nvarchar [ ( n | max ) ]
Variable-length Unicode character data. n can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes. The storage size, in bytes, is two times the number of characters entered + 2 bytes. The data entered can be 0 characters in length. The ISO synonyms for nvarchar are national char varying and national character varying.
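As a quick arithmetic check of the sizes quoted above, here is a minimal Python sketch. The function names are hypothetical, and the figures assume exactly the documented formulas (two bytes per character, plus two bytes of overhead for nvarchar), ignoring compression and any characters that need more than two bytes.

```python
def nchar_storage_bytes(n: int) -> int:
    """Fixed-length nchar(n): two times n bytes, per the quoted text."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 2 * n


def nvarchar_storage_bytes(chars_entered: int) -> int:
    """Variable-length nvarchar: two bytes per character entered + 2 bytes."""
    if chars_entered < 0:
        raise ValueError("chars_entered must be non-negative")
    return 2 * chars_entered + 2


print(nchar_storage_bytes(10))       # 20
print(nvarchar_storage_bytes(0))     # 2
print(nvarchar_storage_bytes(10))    # 22
print(nvarchar_storage_bytes(4000))  # 8002
```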
That said, Unicode compression was introduced in SQL Server 2008 R2, so it might store ASCII characters as 1 byte each. You can read about Unicode compression here
Given that Unicode defines more than 65,536 characters, it should be obvious that not every character can fit in just two octets (i.e. 16 bits).
SQL Server, like most of Microsoft's products (Windows, .NET, NTFS, …) uses UTF-16 to store text, in which a character takes up either two or four octets, although as @SQLMenace points out, current versions of SQL Server use compression to reduce that.
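The two-or-four-octet behaviour of UTF-16 is easy to see with a small Python sketch. This uses Python's own UTF-16 codec rather than SQL Server, the helper name `utf16_byte_length` is hypothetical, and it says nothing about what compression does to the on-disk size.

```python
def utf16_byte_length(text: str) -> int:
    """Number of bytes text occupies in UTF-16 (little-endian, no BOM)."""
    return len(text.encode("utf-16-le"))


print(utf16_byte_length("A"))           # 2: one 16-bit code unit
print(utf16_byte_length("\u20ac"))      # 2: the euro sign is still one code unit
print(utf16_byte_length("\U0001F600"))  # 4: outside the first 65,536 code points,
                                        #    so it needs a surrogate pair
```

Characters within the first 65,536 code points take two bytes each; anything beyond that range is encoded as a pair of 16-bit units and takes four.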