
How significant is the performance difference when joining on nvarchar versus on int?

I understand that a join on nvarchar is slower because the index is bigger: nvarchar uses 2 bytes per character, whereas int is always 4 bytes. Is the join performance difference significant? Is there any strong reason to avoid joining on nvarchar? I couldn't find any MSDN article on the topic.

Andras Csehi asked Jun 26 '11 12:06





1 Answer

At least 8x CPU. This is the measurable increase when comparing nvarchar rather than varchar: Unicode sorting and comparison rules are more complex than straight varchar.

  • What are the main performance differences between varchar and nvarchar SQL Server data types?
  • SQL Server uses high CPU when searching inside nvarchar strings

So, assuming varchar and int are equal (they aren't), nvarchar will have overhead compared to int.

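Then, byte for byte (N'1234' vs 1234), you're comparing 10 bytes vs 4 bytes: the nvarchar value takes 8 bytes of character data plus 2 bytes of length overhead. This also means a wider key, so fewer index and data entries fit per page, which means more IO.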
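The byte arithmetic above can be sketched in a few lines of Python. This is only a rough illustration, not a model of SQL Server's storage engine: the helper names are hypothetical, and the entries-per-page figure assumes a raw 8 KB page and ignores the page header and per-row overhead, so it is an upper bound at best.

```python
import struct

# Assumption: an 8 KB page with no header or per-row overhead (upper bound only).
PAGE_BYTES = 8192

# A 4-byte signed integer, matching the width of a SQL Server int.
INT_KEY_BYTES = struct.calcsize("<i")


def nvarchar_key_bytes(value: str) -> int:
    """Approximate nvarchar storage: 2 bytes per UTF-16 code unit
    plus 2 bytes of variable-length overhead."""
    return len(value.encode("utf-16-le")) + 2


def entries_per_page(key_bytes: int) -> int:
    """Rough upper bound on how many keys of this width fit on one page."""
    return PAGE_BYTES // key_bytes


nvarchar_bytes = nvarchar_key_bytes("1234")
print("nvarchar key bytes:", nvarchar_bytes)
print("int key bytes:", INT_KEY_BYTES)
print("nvarchar entries per page (upper bound):", entries_per_page(nvarchar_bytes))
print("int entries per page (upper bound):", entries_per_page(INT_KEY_BYTES))
```

Under these assumptions the int key packs roughly two and a half times as many entries onto a page, and the gap widens as the string key gets longer.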

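Finally, if your nvarchar is more than 450 characters, you can't index it, because the index key is at most 900 bytes wide. (SQL Server 2016 and later raise this limit to 1700 bytes for nonclustered index keys; clustered keys remain limited to 900 bytes.)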
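The 450-character figure follows directly from dividing the key limit by 2 bytes per character. A minimal sketch, with hypothetical function names and a `key_limit_bytes` parameter that defaults to the 900-byte limit quoted above:

```python
BYTES_PER_NVARCHAR_CHAR = 2  # nvarchar stores 2 bytes per character


def max_indexable_nvarchar_length(key_limit_bytes: int = 900) -> int:
    """Longest declared nvarchar length whose worst case still fits in the key."""
    return key_limit_bytes // BYTES_PER_NVARCHAR_CHAR


def fits_in_index_key(declared_length: int, key_limit_bytes: int = 900) -> bool:
    """True if a fully populated nvarchar(declared_length) fits in the key limit."""
    return declared_length * BYTES_PER_NVARCHAR_CHAR <= key_limit_bytes


print(max_indexable_nvarchar_length())
print(fits_in_index_key(450), fits_in_index_key(451))
```

Passing a different `key_limit_bytes` lets you redo the check for whatever limit applies to your SQL Server version and index type.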

gbn answered Oct 24 '22 22:10
