I want to have a unique constraint on a column which I am going to populate with GUIDs. However, my data contains null values for this columns. How do I create the constraint that allows multiple null values?
Here's an example scenario. Consider this schema:
CREATE TABLE People (
  Id INT CONSTRAINT PK_MyTable PRIMARY KEY IDENTITY,
  Name NVARCHAR(250) NOT NULL,
  LibraryCardId UNIQUEIDENTIFIER NULL,
  CONSTRAINT UQ_People_LibraryCardId UNIQUE (LibraryCardId)
)
Then see this code for what I'm trying to achieve:
-- This works fine:
INSERT INTO People (Name, LibraryCardId) 
 VALUES ('John Doe', 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA');
-- This also works fine, obviously:
INSERT INTO People (Name, LibraryCardId) 
VALUES ('Marie Doe', 'BBBBBBBB-BBBB-BBBB-BBBB-BBBBBBBBBBBB');
-- This would *correctly* fail:
--INSERT INTO People (Name, LibraryCardId) 
--VALUES ('John Doe the Second', 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA');
-- This works fine this one first time:
INSERT INTO People (Name, LibraryCardId) 
VALUES ('Richard Roe', NULL);
-- THE PROBLEM: This fails even though I'd like to be able to do this:
INSERT INTO People (Name, LibraryCardId) 
VALUES ('Marcus Roe', NULL);
The final statement fails with a message:
Violation of UNIQUE KEY constraint 'UQ_People_LibraryCardId'. Cannot insert duplicate key in object 'dbo.People'.
How can I change my schema and/or uniqueness constraint so that it allows multiple NULL values, while still checking for uniqueness on actual data?
You can insert NULL values into columns with the UNIQUE constraint because NULL is the absence of a value, so it is never equal to other NULL values and not considered a duplicate value.
There are way to create a unique index that allows multiple nulls in MS SQL Server, but it's not the default. The default in MS is to allow just one NULL. But that's not the ANSI standard. ANSI standards 92, 99, and 03 explicitly allow multiple nulls with unique.
Key Differences Between Primary key and Unique key: Primary key will not accept NULL values whereas Unique key can accept NULL values.
As you know, when you create a UNIQUE constraint on a nullable column, SQL Server allows only one NULL value, thereby maintaining the UNIQUEness. However, there are situations when we need more than one NULL value in the column but still have to maintain uniqueness, ignoring all those NULL values.
What you're looking for is indeed part of the ANSI standards SQL:92, SQL:1999 and SQL:2003, ie a UNIQUE constraint must disallow duplicate non-NULL values but accept multiple NULL values.
In the Microsoft world of SQL Server however, a single NULL is allowed but multiple NULLs are not...
In SQL Server 2008, you can define a unique filtered index based on a predicate that excludes NULLs:
CREATE UNIQUE NONCLUSTERED INDEX idx_yourcolumn_notnull
ON YourTable(yourcolumn)
WHERE yourcolumn IS NOT NULL;
In earlier versions, you can resort to VIEWS with a NOT NULL predicate to enforce the constraint.
You can create a unique index that accept multiple NULLs with a WHERE clause. See the answer below.
You cannot create a UNIQUE constraint and allow NULLs. You need set a default value of NEWID().
Update the existing values to NEWID() where NULL before creating the UNIQUE constraint.
SQL Server 2008 And Up
Just filter a unique index:
CREATE UNIQUE NONCLUSTERED INDEX UQ_Party_SamAccountName
ON dbo.Party(SamAccountName)
WHERE SamAccountName IS NOT NULL;
In Lower Versions, A Materialized View Is Still Not Required
For SQL Server 2005 and earlier, you can do it without a view. I just added a unique constraint like you're asking for to one of my tables. Given that I want uniqueness in column SamAccountName, but I want to allow multiple NULLs, I used a materialized column rather than a materialized view:
ALTER TABLE dbo.Party ADD SamAccountNameUnique
   AS (Coalesce(SamAccountName, Convert(varchar(11), PartyID)))
ALTER TABLE dbo.Party ADD CONSTRAINT UQ_Party_SamAccountName
   UNIQUE (SamAccountNameUnique)
You simply have to put something in the computed column that will be guaranteed unique across the whole table when the actual desired unique column is NULL. In this case, PartyID is an identity column and being numeric will never match any SamAccountName, so it worked for me. You can try your own method—be sure you understand the domain of your data so that there is no possibility of intersection with real data. That could be as simple as prepending a differentiator character like this:
Coalesce('n' + SamAccountName, 'p' + Convert(varchar(11), PartyID))
Even if PartyID became non-numeric someday and could coincide with a SamAccountName, now it won't matter.
Note that the presence of an index including the computed column implicitly causes each expression result to be saved to disk with the other data in the table, which DOES take additional disk space.
Note that if you don't want an index, you can still save CPU by making the expression be precalculated to disk by adding the keyword PERSISTED to the end of the column expression definition.
In SQL Server 2008 and up, definitely use the filtered solution instead if you possibly can!
Controversy
Please note that some database professionals will see this as a case of "surrogate NULLs", which definitely have problems (mostly due to issues around trying to determine when something is a real value or a surrogate value for missing data; there can also be issues with the number of non-NULL surrogate values multiplying like crazy).
However, I believe this case is different. The computed column I'm adding will never be used to determine anything. It has no meaning of itself, and encodes no information that isn't already found separately in other, properly defined columns. It should never be selected or used.
So, my story is that this is not a surrogate NULL, and I'm sticking to it! Since we don't actually want the non-NULL value for any purpose other than to trick the UNIQUE index to ignore NULLs, our use case has none of the problems that arise with normal surrogate NULL creation.
All that said, I have no problem with using an indexed view instead—but it brings some issues with it such as the requirement of using SCHEMABINDING. Have fun adding a new column to your base table (you'll at minimum have to drop the index, and then drop the view or alter the view to not be schema bound). See the full (long) list of requirements for creating an indexed view in SQL Server (2005) (also later versions), (2000).
Update
If your column is numeric, there may be the challenge of ensuring that the unique constraint using Coalesce does not result in collisions. In that case, there are some options. One might be to use a negative number, to put the "surrogate NULLs" only in the negative range, and the "real values" only in the positive range. Alternately, the following pattern could be used. In table Issue (where IssueID is the PRIMARY KEY), there may or may not be a TicketID, but if there is one, it must be unique.
ALTER TABLE dbo.Issue ADD TicketUnique
   AS (CASE WHEN TicketID IS NULL THEN IssueID END);
ALTER TABLE dbo.Issue ADD CONSTRAINT UQ_Issue_Ticket_AllowNull
   UNIQUE (TicketID, TicketUnique);
If IssueID 1 has ticket 123, the UNIQUE constraint will be on values (123, NULL). If IssueID 2 has no ticket, it will be on (NULL, 2). Some thought will show that this constraint cannot be duplicated for any row in the table, and still allows multiple NULLs.
For people who are using Microsoft SQL Server Manager and want to create a Unique but Nullable index you can create your unique index as you normally would then in your Index Properties for your new index, select "Filter" from the left hand panel, then enter your filter (which is your where clause). It should read something like this:
([YourColumnName] IS NOT NULL)
This works with MSSQL 2012
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With