I'm trying to find an intuitive way of enforcing mutual uniqueness across two columns in a table. I am not looking for composite uniqueness, where duplicate combinations of keys are disallowed; rather, I want a rule where any of the keys cannot appear again in either column. Take the following example:
CREATE TABLE Rooms
(
Id INT NOT NULL PRIMARY KEY,
)
CREATE TABLE Occupants
(
PersonName VARCHAR(20),
LivingRoomId INT NULL REFERENCES Rooms (Id),
DiningRoomId INT NULL REFERENCES Rooms (Id),
)
A person may pick any room as their living room, and any other room as their dining room. Once a room has been allocated to an occupant, it cannot be allocated again to another person (whether as a living room or as a dining room).
I'm aware that this issue can be resolved through data normalization; however, I cannot change the schema make breaking changes to the schema.
Update: In response to the proposed answers:
Two unique constraints (or two unique indexes) will not prevent duplicates across the two columns. Similarly, a simple LivingRoomId != DiningRoomId
check constraint will not prevent duplicates across rows. For example, I want the following data to be forbidden:
INSERT INTO Rooms VALUES (1), (2), (3), (4)
INSERT INTO Occupants VALUES ('Alex', 1, 2)
INSERT INTO Occupants VALUES ('Lincoln', 2, 3)
Room 2 is occupied simultaneously by Alex (as a living room) and by Lincoln (as a dining room); this should not be allowed.
Update2: I've run some tests on the three main proposed solutions, timing how long they would take to insert 500,000 rows into the Occupants
table, with each row having a pair of random unique room ids.
Extending the Occupants
table with unique indexes and a check constraint (that calls a scalar function) causes the insert to take around three times as long. The implementation of the scalar function is incomplete, only checking that new occupants' living room does not conflict with existing occupants' dining room. I couldn't get the insert to complete in reasonable time if the reverse check was performed as well.
Adding a trigger that inserts each occupant's room as a new row into another table decreases performance by 48%. Similarly, an indexed view takes 43% longer. In my opinion, using an indexed view is cleaner, since it avoids the need for creating another table, as well as allows SQL Server to automatically handle updates and deletes as well.
The full scripts and results from the tests are given below:
SET STATISTICS TIME OFF
SET NOCOUNT ON
CREATE TABLE Rooms
(
Id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
RoomName VARCHAR(10),
)
CREATE TABLE Occupants
(
Id INT NOT NULL PRIMARY KEY IDENTITY(1,1),
PersonName VARCHAR(10),
LivingRoomId INT NOT NULL REFERENCES Rooms (Id),
DiningRoomId INT NOT NULL REFERENCES Rooms (Id)
)
GO
DECLARE @Iterator INT = 0
WHILE (@Iterator < 10)
BEGIN
INSERT INTO Rooms
SELECT TOP (1000000) 'ABC'
FROM sys.all_objects s1 WITH (NOLOCK)
CROSS JOIN sys.all_objects s2 WITH (NOLOCK)
CROSS JOIN sys.all_objects s3 WITH (NOLOCK);
SET @Iterator = @Iterator + 1
END;
DECLARE @RoomsCount INT = (SELECT COUNT(*) FROM Rooms);
SELECT TOP 1000000 RoomId
INTO ##RandomRooms
FROM
(
SELECT DISTINCT
CAST(RAND(CHECKSUM(NEWID())) * @RoomsCount AS INT) + 1 AS RoomId
FROM sys.all_objects s1 WITH (NOLOCK)
CROSS JOIN sys.all_objects s2 WITH (NOLOCK)
) s
ALTER TABLE ##RandomRooms
ADD Id INT IDENTITY(1,1)
SELECT
'XYZ' AS PersonName,
R1.RoomId AS LivingRoomId,
R2.RoomId AS DiningRoomId
INTO ##RandomOccupants
FROM ##RandomRooms R1
JOIN ##RandomRooms R2
ON R2.Id % 2 = 0
AND R2.Id = R1.Id + 1
GO
PRINT CHAR(10) + 'Test 1: No integrity check'
CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON
INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants
SET STATISTICS TIME OFF
SET NOCOUNT ON
TRUNCATE TABLE Occupants
PRINT CHAR(10) + 'Test 2: Unique indexes and check constraint'
CREATE UNIQUE INDEX UQ_LivingRoomId
ON Occupants (LivingRoomId)
CREATE UNIQUE INDEX UQ_DiningRoomId
ON Occupants (DiningRoomId)
GO
CREATE FUNCTION CheckExclusiveRoom(@occupantId INT)
RETURNS BIT AS
BEGIN
RETURN
(
SELECT CASE WHEN EXISTS
(
SELECT *
FROM Occupants O1
JOIN Occupants O2
ON O1.LivingRoomId = O2.DiningRoomId
-- OR O1.DiningRoomId = O2.LivingRoomId
WHERE O1.Id = @occupantId
)
THEN 0
ELSE 1
END
)
END
GO
ALTER TABLE Occupants
ADD CONSTRAINT ExclusiveRoom
CHECK (dbo.CheckExclusiveRoom(Id) = 1)
CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON
INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants
SET STATISTICS TIME OFF
SET NOCOUNT ON
ALTER TABLE Occupants DROP CONSTRAINT ExclusiveRoom
DROP INDEX UQ_LivingRoomId ON Occupants
DROP INDEX UQ_DiningRoomId ON Occupants
DROP FUNCTION CheckExclusiveRoom
TRUNCATE TABLE Occupants
PRINT CHAR(10) + 'Test 3: Insert trigger'
CREATE TABLE RoomTaken
(
RoomId INT NOT NULL PRIMARY KEY REFERENCES Rooms (Id)
)
GO
CREATE TRIGGER UpdateRoomTaken
ON Occupants
AFTER INSERT
AS
INSERT INTO RoomTaken
SELECT RoomId
FROM
(
SELECT LivingRoomId AS RoomId
FROM INSERTED
UNION ALL
SELECT DiningRoomId AS RoomId
FROM INSERTED
) s
GO
CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON
INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants
SET STATISTICS TIME OFF
SET NOCOUNT ON
DROP TRIGGER UpdateRoomTaken
DROP TABLE RoomTaken
TRUNCATE TABLE Occupants
PRINT CHAR(10) + 'Test 4: Indexed view with unique index'
CREATE TABLE TwoRows
(
Id INT NOT NULL PRIMARY KEY
)
INSERT INTO TwoRows VALUES (1), (2)
GO
CREATE VIEW OccupiedRooms
WITH SCHEMABINDING
AS
SELECT RoomId = CASE R.Id WHEN 1
THEN O.LivingRoomId
ELSE O.DiningRoomId
END
FROM dbo.Occupants O
CROSS JOIN dbo.TwoRows R
GO
CREATE UNIQUE CLUSTERED INDEX UQ_OccupiedRooms
ON OccupiedRooms (RoomId);
CHECKPOINT;
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
SET NOCOUNT OFF
SET STATISTICS TIME ON
INSERT INTO Occupants
SELECT *
FROM ##RandomOccupants
SET STATISTICS TIME OFF
SET NOCOUNT ON
DROP INDEX UQ_OccupiedRooms ON OccupiedRooms
DROP VIEW OccupiedRooms
DROP TABLE TwoRows
TRUNCATE TABLE Occupants
DROP TABLE ##RandomRooms
DROP TABLE ##RandomOccupants
DROP TABLE Occupants
DROP TABLE Rooms
/* Results:
Test 1: No integrity check
SQL Server Execution Times:
CPU time = 5210 ms, elapsed time = 10853 ms.
(500000 row(s) affected)
Test 2: Unique indexes and check constraint
SQL Server Execution Times:
CPU time = 21996 ms, elapsed time = 27019 ms.
(500000 row(s) affected)
Test 3: Insert trigger
SQL Server parse and compile time:
CPU time = 5663 ms, elapsed time = 11192 ms.
SQL Server Execution Times:
CPU time = 4914 ms, elapsed time = 4913 ms.
(1000000 row(s) affected)
SQL Server Execution Times:
CPU time = 10577 ms, elapsed time = 16105 ms.
(500000 row(s) affected)
Test 4: Indexed view with unique index
SQL Server Execution Times:
CPU time = 10171 ms, elapsed time = 15777 ms.
(500000 row(s) affected)
*/
We implement this using CASE WHEN ensconced in the CREATE UNIQUE INDEX statement: SQL> -- Conditional unique index on multiple columns SQL> create unique index demo_fbi_idx 2 on demo_fbi 3 (case when active_flag = 'Y' then 4 col1 else null end, 5 case when active_flag = 'Y' then 6 col2 else null end); Index created.
Both the UNIQUE and PRIMARY KEY constraints provide a guarantee for uniqueness for a column or set of columns. A PRIMARY KEY constraint automatically has a UNIQUE constraint. However, you can have many UNIQUE constraints per table, but only one PRIMARY KEY constraint per table.
A unique index ensures the index key columns do not contain any duplicate values. A unique index may consist of one or many columns. If a unique index has one column, the values in this column will be unique. In case the unique index has multiple columns, the combination of values in these columns is unique.
Yes, you can define UNIQUE constraints to columns other than the primary key in order to ensure the data is unique between rows.
I think the only way to do this is to use constraint and a Function.
Pseudo code (haven't done this for a long time):
CREATE FUNCTION CheckExlusiveRoom
RETURNS bit
declare @retval bit
set @retval = 0
select retval = 1
from Occupants as Primary
join Occupants as Secondary
on Primary.LivingRoomId = Secondary.DiningRoomId
where Primary.ID <> Secondary.ID
or ( Primary.DiningRoomId= Secondary.DiningRoomId
or Primary.LivingRoomId = Secondary.LivingRoomID)
return @retval
GO
Then, use this function in a check constraint....
Alternative would be to use an intermediate table OccupiedRoom, where you would always insert rooms that are used (by trigger for instance? ) and FK to it instead of Room table
Reaction to comment:
Do you need to enforce it directly on the table, or is a constraint violation happening in reaction to insert/update enough? Because then I am thinking like this:
create a simple table:
create table RoomTaken (RoomID int primary key references Room (Id) )
create a trigger on insert/update/delete, that makes sure that any Room used in Occupants is kept in RoomID as well.
If you try to duplicate room usage, RoomTaken table will throw an PK violation
Not sure if this is enough and/or how it would compare in speed to the UDF (I assume it would be superior).
And yes, I see the problem that RoomTaken would not FK to the usage in Occupants, but... really, you are working under some constraints and there is no perfect solution - it's speed (UDF) vs 100% integrity enforcement in my opinion.
You could create an "external" constraint in the form of an indexed view:
CREATE VIEW dbo.OccupiedRooms
WITH SCHEMABINDING
AS
SELECT r.Id
FROM dbo.Occupants AS o
INNER JOIN dbo.Rooms AS r ON r.Id IN (o.LivingRoomId, o.DiningRoomId)
;
GO
CREATE UNIQUE CLUSTERED INDEX UQ_1 ON dbo.OccupiedRooms (Id);
The view is essentially unpivoting the occupied rooms' IDs, putting them all in one column. The unique index on that column makes sure it does not have duplicates.
Here are demonstrations of how this method works:
failed insert;
successful insert.
UPDATE
As hvd has correctly remarked, the above solution does not catch attempts to insert identical LivingRoomId
and DiningRoomId
when they are put on the same row. This is because the dbo.Rooms
table is matched only once in that case and, therefore, the join produces produces just one row for the pair of references.
One way to fix that is suggested in the same comment: additionally to the indexed view, use a CHECK constraint on the dbo.OccupiedRooms
table to prohibit rows with identical room IDs. The suggested LivingRoomId <> DiningRoomId
condition, however, will not work for cases where both columns are NULL. To account for that case, the condition could be expanded to this one:
LivingRoomId <> DinindRoomId AND (LivingRoomId IS NOT NULL OR DinindRoomId IS NOT NULL)
Alternatively, you could change the view's SELECT statement to catch all situations. If LivingRoomId
and DinindRoomId
were NOT NULL
columns, you could avoid a join to dbo.Rooms
and unpivot the columns using a cross-join to a virtual 2-row table:
SELECT Id = CASE x.r WHEN 1 THEN o.LivingRoomId ELSE o.DiningRoomId END
FROM dbo.Occupants AS o
CROSS
JOIN (SELECT 1 UNION ALL SELECT 2) AS x (r)
However, as those columns allow NULLs, this method would not allow you to insert more than one single-reference row. To make it work in your case, you would need to filter out NULL entries, but only if they come from rows where the other reference is not NULL. I believe adding the following WHERE clause to the above query would suffice:
WHERE o.LivingRoomId IS NULL AND o.DinindRoomId IS NULL
OR x.r = 1 AND o.LivingRoomId IS NOT NULL
OR x.r = 2 AND o.DinindRoomId IS NOT NULL
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With