I wish to add a unique constraint to a table using the following query:
ALTER TABLE events ADD CONSTRAINT events_timestamp_request_session_key UNIQUE (event_timestamp, request_id, session_id);
The query is valid, but on one database the existing data does not meet the constraint, so the query fails:
ERROR: could not create unique index "events_timestamp_request_session_key"
DETAIL: Key (event_timestamp, request_id, session_id)=(2017-07-05 14:53:25.475246+00, a55df6-8533e322cd-aa9d57-87e2, 132443) is duplicated.
It is expected that there would be a few duplicates, but unfortunately I cannot simply delete or alter them.
Is there any way to add the constraint as required, while ignoring the existing data in the table?
You can use a partial unique index for this. It's not a particularly nice solution, but it will work until you can correct your old data.
Something like:
CREATE UNIQUE INDEX events_timestamp_request_session_key
ON events (event_timestamp, request_id, session_id)
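WHERE event_timestamp >= '2017-07-01 00:00:00+00'::timestamptz;
where the timestamp is the start of your clean data. (The error output shows a time zone offset, so the column is presumably timestamptz. Comparing it against a plain timestamp depends on the session time zone, which an index predicate does not accept, so the cutoff is written as a timestamptz literal here.)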
The WHERE clause limits the index to rows with an event timestamp on or after the cutoff. Older rows are excluded from the index entirely, so they are never considered in uniqueness checks.
Doc: https://www.postgresql.org/docs/9.6/static/indexes-partial.html
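To see that behaviour in isolation, here is a minimal sketch on a scratch table. The table name `events_demo`, the column types, and the sample values are assumptions made up for illustration; only the shape of the index follows the answer above.

```sql
-- Scratch table; the column types are assumed, not taken from the real schema.
CREATE TABLE events_demo (
    event_timestamp timestamptz NOT NULL,
    request_id      text        NOT NULL,
    session_id      integer     NOT NULL
);

-- Two legacy rows with the same key, both dated before the cutoff.
INSERT INTO events_demo VALUES
    ('2017-06-01 00:00:00+00', 'req-a', 1),
    ('2017-06-01 00:00:00+00', 'req-a', 1);

-- Succeeds: the legacy duplicates fall outside the index predicate.
CREATE UNIQUE INDEX events_demo_key
    ON events_demo (event_timestamp, request_id, session_id)
    WHERE event_timestamp >= '2017-07-01 00:00:00+00'::timestamptz;

-- Succeeds: first row with this key after the cutoff.
INSERT INTO events_demo VALUES ('2017-07-05 00:00:00+00', 'req-b', 2);

-- Fails with a unique violation: same key, and the row is inside the predicate.
INSERT INTO events_demo VALUES ('2017-07-05 00:00:00+00', 'req-b', 2);
```

One consequence of this design: any row inserted later with a timestamp before the cutoff is also outside the index, so it is not checked for uniqueness either.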
I don't think there is a built-in method for doing this. You can, however, create a unique index using an expression. Let me assume that the table has a serial unique id column, here called event_id:
create unique index unq_events_timestamp_request_session_key
on events (event_timestamp, request_id, session_id,
(case when event_id < 99999 then event_id else -1 end)
);
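The expression essentially says: "if this row was already in the table, then ignore it for the unique constraint". Rows below the cutoff id (99999 is a placeholder) contribute their own event_id as a fourth key component, so they can never collide; all newer rows contribute -1, so the other three columns must be unique among them.
You could instead enforce the unique constraint for all rows except the current duplicates by exempting only those rows: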
create unique index unq_events_timestamp_request_session_key
on events (event_timestamp, request_id, session_id,
(case when event_id in (1, 2, 3, 5, 8) then event_id
else -1
end)
);
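This would require research into the current duplicates, since the list must contain their ids. Actually, you can also do either version with a WHERE clause, that is, a partial index. Create one of the following, not both: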
create unique index unq_events_timestamp_request_session_key
on events (event_timestamp, request_id, session_id)
where event_id >= 99999 ;
-- or, exempting only the known duplicates:
create unique index unq_events_timestamp_request_session_key
on events (event_timestamp, request_id, session_id)
where event_id not in (1, 2, 3, 5, 8) ;
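The ids for the exemption list have to come from the existing data. One possible way to collect them is sketched below; it assumes, as this answer does, that the table has a unique event_id column, which the question does not confirm.

```sql
-- List the event_id of every row whose (event_timestamp, request_id, session_id)
-- key occurs more than once. event_id is an assumed unique id column.
SELECT event_id
FROM (
    SELECT event_id,
           count(*) OVER (
               PARTITION BY event_timestamp, request_id, session_id
           ) AS copies
    FROM events
) AS keyed
WHERE copies > 1
ORDER BY event_id;
```

The resulting ids would replace the placeholder list (1, 2, 3, 5, 8) in the index definition.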