Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server ignore_dup_key on apparently ignored for merge statement

SQL Server 2008.

I have few pieces of code working together and the effect is rather strange so I decided to share.

I have a table with emails - for simplicity let as assume there are 2 columns: ID identity(1,1) int primary key email varchar(900)

Also there is a unique key on email column with IGNORE_DUP_KEY = ON.

On the other side of a problem is a merge statement:

merge into dbo.email
using (
     select distinct email t from #t
) p 
ON t = email
when not matched by target then
insert (email) values  (t);

It is worth to note that #t is basically (for purpose of this query) a one column table with email varchar(500).

What is surprising is that the query failed:

"Violation of UNIQUE KEY constraint 'uq_email'. Cannot insert duplicate key in object dbo.email"

However this works flawlessly:

insert into dbo.email (email) select email from #t

While I can obviously work around the problem (the insert statement is faster anyway so it is a way to go) I have no idea why the MERGE statement failed. Any ideas, anyone?

EDIT: Full use case: Step 1:

create table #temp (
col1 varchar(500),
col2 varchar(500),
col3 varchar(500),
col4 varchar(500),
col5 varchar(500),
email varchar(500),
id int)

Step 2:

#temp is populated from CSV file for many purposes

Step 3:

merge into dbo.email using ( select distinct email t from #temp ) p ON t = email when not matched by target then insert (email) values (t);

Step 0 - CREATE script for dbo.email:

CREATE TABLE dbo.email (
id int identity(1,1) not null,
email varchar(900) null,
loaddate date default null,
constraint [PK__email__1111] PRIMARY KEY CLUSTERED
(
    id asc
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY,
CONSTRAINT [uq_email] UNIQUE NONCLUSTERED
(
     EMAIL asc
)
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY
) ON PRIMARY

I can't make much more without giving you hunderds of gigabytes of data which is out of a question obviously.

like image 715
nimdil Avatar asked Jan 17 '23 12:01

nimdil


1 Answers

Your email column allows for null values. If you have a null value in target and a null value in source they will not be matched in the on expression. You will have a unique constraint exception when you are inserting the second null value.

Try this instead:

merge into dbo.email
using (
     select distinct email t from #t where email is not null
) p 
ON t = email
when not matched by target then
insert (email) values  (t);

Update:
Regarding ignore_dup_key you should read the remarks section in documentation for the merge statement:

If IGNORE_DUP_KEY is set to ON for any unique indexes on the target table, MERGE ignores this setting.

To have a unique constraint in SQL Server that allows for multiple null values you should add a unique filtered index instead.

create unique index UX_xx on TableName(ColName) where ColName is not null
like image 126
Mikael Eriksson Avatar answered Jan 29 '23 09:01

Mikael Eriksson