Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is postgres 9.5+ on conflict(primary key) faster than where not exists?

Tags:

postgresql

Let's say you have any insert query as below -

insert into abc select * from def dd where not exists (select 1 from abc aa where aa.id = dd.id);

Vs

insert into abc select * from def dd on conflict(id) do nothing;

which one is faster?

Both tables are really large and the idea is to not insert if there is a conflict with the primary key

like image 730
Tisha Avatar asked Mar 10 '23 16:03

Tisha


2 Answers

Which is faster is not really relevant. Actually, both should be very comparable, because both will be looking up the value using a unique index.

More important is that NOT EXISTS is not semantically correct. Race conditions could cause two queries to both fail the NOT EXISTS . . . and then both would try to insert the same value.

So, for safety, I strongly advise using the ON CONFLICT formulation. It ensures that the database ensure the consistency of the data.

like image 59
Gordon Linoff Avatar answered Mar 13 '23 05:03

Gordon Linoff


Which is faster often depends on specifics. A blanket answer is not usually possible. And while they are subtly different they are close enough that they may be substitutes in a given circumstance.

The first is an antijoin while the second is a simple select with instructions to continue on conflict. This means that the planner may be able to use information to effectively run it faster. So in general, NOT EXISTS should perform better.

But this is not always the case. If the two tables are big enough, you might have a more expensive join strategy and that could be a problem.

like image 27
Chris Travers Avatar answered Mar 13 '23 05:03

Chris Travers