Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Complex LIKE Query extremely slow againt NVARCHAR(450), compared to VARCHAR(450)

I'm currently wondering about some performance differences on VARCHAR/NVARCHAR, especially when using Complex LIKE queries (that start with _ or %).

I have a testsetup on Microsoft SQL Server 2014. I have 2 Tables. Both have an ID Field (identity(1, 1), and a Value Field (either VARCHAR(450) OR NVARCHAR(450)). Both have the identical 1'000'000 randomly generated entries.

The tables are named tblVarCharNoIndex and tblNVarCharNoIndex (so, there are no indexes. The behaviour is nearly the same, if I use indexes).

Now, I execute the following queries test the duration (once on VARCHAR; once on NVARCHAR)

SELECT * FROM tblVarcharNoIndex WHERE Value LIKE '%ab%'
SELECT * FROM tblNVarcharNoIndex WHERE Value LIKE '%ab%'

The execution times are wildly different. It takes 1540ms on the VARCHAR Table, and 8630 ms on the NVARCHAR Table, so it takes over 5x longer with NVARCHAR.

I understand, that NVARCHAR has performance implications, since it needs 2 bytes to store, this totally makes sense. But I can't explain a performance degradation by 500%, this makes no sense to me.

As per Request, here some more Data.

Query for table creation

CREATE TABLE [dbo].[tblVarcharNoIndex](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Value] [varchar](450) NOT NULL,
 CONSTRAINT [PK_tblVarcharNoIndex] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]


CREATE TABLE [dbo].[tblNVarcharNoIndex](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Value] [nvarchar](450) NOT NULL,
 CONSTRAINT [PK_tblNVarcharNoIndex] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Query for value generation

DECLARE @cnt INT = 0;
DECLARE @entries INT = 1000000 --1'000'000;
DECLARE @maxLength INT = 450;
DECLARE @minLength INT = 50;
DECLARE @value VARCHAR(450)
DECLARE @length INT

WHILE @cnt < @entries
BEGIN
    SELECT @value = ''
    SET @length = @minLength + CAST(RAND() * (@maxLength - @minLength) as INT)
    WHILE @length <> 0
    BEGIN
        SELECT @value = @value + CHAR(CAST(RAND() * 96 + 32 as INT))
        SET @length = @length - 1
    END

    INSERT INTO tblBase(Value, NValue) VALUES (@value, @value)

    SET @cnt = @cnt + 1;

END;

(Values are copied later from tblBase)

LIKE Query in question

DECLARE @start DATETIME
DECLARE @end DATETIME
DECLARE @testname NVARCHAR(100) = 'INSERT FROM other table'

--VARCHAR No Index
PRINT 'starting ''' + @testname + ''' on VARCHAR (No Index)'
SET @start = GETDATE()

SELECT * FROM tblVarcharNoIndex WHERE Value LIKE '%ab%' --This takes 1540ms

SET @end = GETDATE()
PRINT '-- finished ''' + @testname + ''' on VARCHAR (No Index)'
PRINT '-- Duration ' + CAST(DATEDIFF(mcs, @start, @end) AS VARCHAR(100)) + ' microseconds'


--NVARCHAR No Index
PRINT 'starting ''' + @testname + ''' on NVARCHAR (No Index)'
SET @start = GETDATE()

SELECT * FROM tblNVarcharNoIndex WHERE Value LIKE '%ab%' --This takes 8630ms

SET @end = GETDATE()
PRINT '-- finished ''' + @testname + ''' on NVARCHAR (No Index)'
PRINT '-- Duration ' + CAST(DATEDIFF(mcs, @start, @end) AS VARCHAR(100)) + ' microseconds'

Execution Plan The execution plan for both the queries looks exactly the same (I have nowhere to upload the image right now, but it's really simple):

SELECT (0%) <--- Parallelism (Gather Streams) (3%) <--- Clustered Index Scan ON Primary Key (97%)

like image 600
crazy_crank Avatar asked Jul 25 '16 13:07

crazy_crank


1 Answers

It will not be possible to give you specifics without more data, starting with an execution plan for both queries.

Some general reasons:

  1. As you stated there are twice as many bytes to read when doing a scan
  2. The number of page loads will increase
  3. The amount of necessary memory will increase, which can cause overflow to disk operations
  4. The amount of CPU will can increase which might be capped based on OS or SQL settings and be causing CPU waits.
like image 115
Joe C Avatar answered Oct 13 '22 01:10

Joe C