
Speeding up select where column condition exists in another table without duplicates

If I have the following two tables:

  1. Table "a" with 2 columns: id (int) [Primary Index], column1 [Indexed]
  2. Table "b" with 3 columns: id_table_a (int), condition1 (int), condition2 (int) [all three columns together form the Primary Index]
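
For concreteness, the two tables described above could be declared roughly as follows. This is only a sketch: the type of column1, the index name idx_column1, and the order of the columns inside b's primary key are assumptions, since the question does not state them.

```sql
-- Sketch of the schema described above (MySQL).
CREATE TABLE a (
  id      INT NOT NULL,
  column1 INT NOT NULL,          -- assumed type; the question does not give it
  PRIMARY KEY (id),
  KEY idx_column1 (column1)      -- hypothetical index name
);

CREATE TABLE b (
  id_table_a INT NOT NULL,
  condition1 INT NOT NULL,
  condition2 INT NOT NULL,
  -- all three columns form the primary key; this column order is assumed
  PRIMARY KEY (id_table_a, condition1, condition2)
);
```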

I can run the following query to select the rows from table a that have a matching row in table b with condition1 = 1:

SELECT a.id FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.id_table_a=a.id && condition1=1 LIMIT 1) ORDER BY a.column1 LIMIT 50

With a couple hundred million rows in both tables this query is very slow. If I do:

SELECT a.id FROM a INNER JOIN b ON a.id=b.id_table_a && b.condition1=1  ORDER BY a.column1 LIMIT 50

It is pretty much instant, but if multiple rows in table b match the same id_table_a, duplicates are returned. If I use SELECT DISTINCT or GROUP BY a.id to remove the duplicates, the query becomes extremely slow.
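
For reference, the de-duplicated variant that turns slow presumably looks like the sketch below. It is reconstructed from the description, not copied from the SQLFiddle. GROUP BY is shown rather than SELECT DISTINCT because MySQL 5.7 and later, with ONLY_FULL_GROUP_BY enabled, may reject DISTINCT combined with an ORDER BY column that is not in the select list.

```sql
-- Same join as above, with GROUP BY a.id collapsing duplicate matches from b.
SELECT a.id
FROM a
INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1
GROUP BY a.id
ORDER BY a.column1
LIMIT 50;
```

Prefixing the statement with EXPLAIN is a reasonable way to see whether MySQL adds a temporary table or a filesort for it.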

Here is an SQLFiddle showing the example queries: http://sqlfiddle.com/#!9/35eb9e/10

Is there a way to make a join without duplicates fast in this case?

*Edited to show that INNER instead of LEFT join didn't make much of a difference

*Edited to show moving condition to join did not make much of a difference

*Edited to add LIMIT

*Edited to add ORDER BY

JJJ asked Jul 31 '16 07:07


2 Answers

You can try an inner join with DISTINCT:

SELECT distinct a.id 
FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1

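Note that DISTINCT applies to every selected column, so with SELECT * the columns from b would keep the duplicates apart and you would get the wrong result. In that case, list only the columns you need: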

SELECT distinct col1, col2, col3 .... 
FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1

You could also add a composite index that includes condition1 as well, e.g. key(id, condition1).
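
As a sketch of that suggestion: in table b the join column is id_table_a, so a composite index there would presumably cover condition1 and id_table_a. The index name idx_condition1_id and the column order are assumptions, not something the answer specifies. If b's primary key already starts with (id_table_a, condition1), an index in that same order would add nothing, which is why the reversed order is shown here.

```sql
-- Hypothetical secondary index on b; name and column order are assumptions.
ALTER TABLE b ADD INDEX idx_condition1_id (condition1, id_table_a);

-- Check whether the optimizer actually picks it up.
EXPLAIN
SELECT DISTINCT a.id
FROM a
INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1;
```

On tables with a couple hundred million rows, building such an index takes time and disk space, so it is worth trying on a copy first.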

If you can, you could also run

 ANALYZE TABLE table_name; 

on both tables.

Another technique is to try reversing the lead table:

SELECT distinct a.id 
FROM b INNER JOIN a ON a.id=b.id_table_a AND b.condition1=1

Use the most selective table to lead the query.

With this, the index usage appears to differ: http://sqlfiddle.com/#!9/35eb9e/15 (the last query adds a "Using where").

# Using DISTINCT to remove duplicates, without extra columns and ORDER BY
EXPLAIN
SELECT DISTINCT a.id
FROM a
INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1;
ScaisEdge answered Oct 01 '22 20:10


It looks like I found the answer.

SELECT a.id FROM a 
INNER JOIN b ON 
    b.id_table_a=a.id && 
    b.condition1=1 && 
    b.condition2=(select b.condition2 from b WHERE b.id_table_a=a.id && b.condition1=1 LIMIT 1)
ORDER BY a.column1
LIMIT 5;

I don't know whether there is a flaw in this; please let me know if there is. If anyone has a way to make this more compact, I will gladly accept your answer.
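
One possible variation, offered as a sketch and not as a tested improvement: because (id_table_a, condition1, condition2) is the primary key of b, pinning condition2 to a single value leaves at most one matching row in b per row in a, which is why the duplicates disappear. The subquery's LIMIT 1 without an ORDER BY picks an arbitrary value; using MIN() instead makes the choice deterministic. The alias b2 is introduced here only for readability, and whether this performs the same on the real data is an assumption to verify with EXPLAIN.

```sql
SELECT a.id
FROM a
INNER JOIN b
   ON b.id_table_a = a.id
  AND b.condition1 = 1
  -- keep exactly one b row per a.id: the one with the smallest condition2
  AND b.condition2 = (SELECT MIN(b2.condition2)
                      FROM b AS b2
                      WHERE b2.id_table_a = a.id
                        AND b2.condition1 = 1)
ORDER BY a.column1
LIMIT 5;
```

When no row in b matches, MIN() returns NULL, the comparison is not true, and the row from a is dropped, which matches the behaviour of the inner join above.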

JJJ answered Oct 01 '22 18:10