SQL Query Performance with duplicate IN parameters

Question

Does having duplicate parameters in the IN clause affect performance of the query compared to eliminating duplicates before executing the query?

SELECT * FROM table WHERE column IN ('A', 'B', 'C', 'A', 'A')

vs

SELECT * FROM table WHERE column IN ('A', 'B', 'C')

I build the query programmatically in Java and am weighing three options:

Use a Set to prevent duplicates automatically;
Use a List but call contains() before every insert;
Just add everything and ignore the duplicate strings.

I assume the performance difference is not significant, but I would like to know the best practice going forward.

Sergey Kalinichenko · Accepted Answer

Duplicates will not noticeably decrease performance, at least not by themselves. However, they can affect performance indirectly if the number of items changes between queries, because the changed statement text forces the server to compute a new query plan.

Assuming that your query is parameterized and there is a known upper limit on the number of IN list elements, it is better to prepare the query with a fixed number of parameters and bind NULLs to the unused positions, with or without duplicates, than to regenerate the query text every time. A NULL in an IN list never compares equal to any value, so the padding matches no rows.
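
A minimal sketch of the fixed-size approach, assuming a limit of five items and string-typed values. The names FixedSizeInList, MAX_ITEMS, my_table, and my_column are hypothetical placeholders, not part of the original question:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FixedSizeInList {
    // Assumed upper bound on the IN list; use whatever your application guarantees.
    static final int MAX_ITEMS = 5;

    // Builds "?, ?, ?" with exactly n placeholders.
    static String placeholders(int n) {
        return String.join(", ", Collections.nCopies(n, "?"));
    }

    // The statement text is identical on every call, whatever values are bound later.
    static String sql() {
        return "SELECT * FROM my_table WHERE my_column IN (" + placeholders(MAX_ITEMS) + ")";
    }

    // Copies the values and pads the copy with nulls up to MAX_ITEMS.
    static List<String> pad(List<String> values) {
        if (values.size() > MAX_ITEMS) {
            throw new IllegalArgumentException("more than " + MAX_ITEMS + " values");
        }
        List<String> padded = new ArrayList<>(values);
        while (padded.size() < MAX_ITEMS) {
            padded.add(null);
        }
        return padded;
    }

    // Binds real values with setString and unused slots with SQL NULL.
    // The caller is responsible for closing the returned statement.
    static PreparedStatement prepare(Connection conn, List<String> values) throws SQLException {
        List<String> padded = pad(values);
        PreparedStatement ps = conn.prepareStatement(sql());
        try {
            for (int i = 0; i < padded.size(); i++) {
                String v = padded.get(i);
                if (v == null) {
                    ps.setNull(i + 1, Types.VARCHAR); // JDBC parameters are 1-based
                } else {
                    ps.setString(i + 1, v);
                }
            }
            return ps;
        } catch (SQLException e) {
            ps.close();
            throw e;
        }
    }
}
```

Because the padding is applied after the values are collected, duplicates only matter here if they push the list past MAX_ITEMS.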

If your query is not parameterized (be very careful with that, since concatenating values into SQL text invites SQL injection), you would be better off not only eliminating the duplicates but also putting the unique items in a consistent order (say, by using a TreeSet). Otherwise, queries with IN lists of ('A', 'B', 'C') and ('A', 'C', 'B') would be treated as different statements, triggering a re-computation of the query plan.
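
As a rough illustration of the TreeSet idea, the sketch below produces the same IN list text for any ordering or duplication of the same values. CanonicalInList and its methods are hypothetical names, and the quote-doubling shown is only a minimal escape, not a substitute for bind parameters:

```java
import java.util.Collection;
import java.util.StringJoiner;
import java.util.TreeSet;

final class CanonicalInList {
    // Wraps a value in single quotes, doubling any embedded single quote.
    static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }

    // Deduplicates and sorts the values, then renders them as ('A', 'B', 'C').
    // TreeSet rejects null elements, so null values are not supported here.
    static String inList(Collection<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        StringJoiner joiner = new StringJoiner(", ", "(", ")");
        for (String v : new TreeSet<>(values)) {
            joiner.add(quote(v));
        }
        return joiner.toString();
    }
}
```

With this helper, the inputs A, C, B, A and B, A, C both render as ('A', 'B', 'C'), so the generated statement text stays identical.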

Another issue that you may run into if you keep duplicates is the maximum length of an IN list. Oracle has traditionally limited an IN list to 1,000 expressions (error ORA-01795), so a list with duplicates may go past the limit even when the number of unique items is well within the allowed maximum.
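
One way to guard against this, sketched under the assumption that the limit is 1,000 and that insertion order should be preserved, is to deduplicate with a LinkedHashSet before checking the size. InListLimit and MAX_IN_LIST are hypothetical names:

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

final class InListLimit {
    // Assumed limit; check the documentation for your database version.
    static final int MAX_IN_LIST = 1000;

    // Removes duplicates (keeping first-seen order) and rejects oversized lists.
    static Set<String> dedupe(Collection<String> values) {
        Set<String> unique = new LinkedHashSet<>(values);
        if (unique.size() > MAX_IN_LIST) {
            throw new IllegalArgumentException(
                unique.size() + " unique values exceed the limit of " + MAX_IN_LIST);
        }
        return unique;
    }
}
```

A list of 1,500 entries that contains only three distinct values passes this check, whereas the raw list would not fit.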

Tags:

oracle-database

java

sql

Michael Sanchez

1 Answer

Sergey Kalinichenko
