Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Add a unique constraint but ignore existing table data

I wish to add a unique constraint to a table using the following query -

ALTER TABLE events ADD CONSTRAINT events_timestamp_request_session_key UNIQUE (event_timestamp, request_id, session_id);

The query is valid, however on one database the existing data does not meet the constraint, and so the query fails -

ERROR: could not create unique index "events_timestamp_request_session_key"
DETAIL: Key (event_timestamp, request_id, session_id)=(2017-07-05 14:53:25.475246+00, a55df6-8533e322cd-aa9d57-87e2, 132443) is duplicated.

It is expected that there would be a few duplicates, but unfortunately I cannot simply delete or alter them.

Is there any way to add the constraint as required, while ignoring the existing data in the table?

like image 598
David Gard Avatar asked Jul 20 '17 10:07

David Gard


2 Answers

You can use a partial index for this, its not a particularly nice solution but it will work until you can correct your old data.

Something like:

CREATE UNIQUE INDEX events_timestamp_request_session_key
ON events (event_timestamp, request_id, session_id)
WHERE event_timestamp >= '2017-07-01'::timestamp;

where the time is the start of your clean data.

The where clause limits the index to only looking at records with a newer event timestamp. The old records are excluded from the index entirely so don't get considered for uniqueness checks.

Doc: https://www.postgresql.org/docs/9.6/static/indexes-partial.html

like image 191
Gary Avatar answered Oct 10 '22 09:10

Gary


I don't think there is a built-in method for doing this. You can, however, create a unique index using an expression. Let me assume that you have a serial unique id in each table:

create unique index unq_events_timestamp_request_session_key 
    on (event_timestamp, request_id, session_id,
        (case when event_id < 99999 then event_id else -1 end)
       ); 

The expression essentially says: "if this key is already in the table, then ignore it for the unique constraint".

You could enforce the unique constraint for all but the current duplicates by eliminating them:

create unique index unq_events_timestamp_request_session_key 
    on (event_timestamp, request_id, session_id,
        (case when event_id in (1, 2, 3, 5, 8) then event_id
              else -1
         end)
       ); 

This would require research into the current duplicates. Actually, you can also do either with a filter clause:

create unique index unq_events_timestamp_request_session_key 
    on (event_timestamp, request_id, session_id)
    where event_id > 999999 ;

create unique index unq_events_timestamp_request_session_key 
    on (event_timestamp, request_id, session_id)
    where event_id not in (1, 2, 3, 5, 8) ;
like image 31
Gordon Linoff Avatar answered Oct 10 '22 08:10

Gordon Linoff