Speeding up select where column condition exists in another table without duplicates

Tags: sql, database, mysql

If I have the following two tables:

1. Table "a" with 2 columns: id (int) [Primary Index], column1 [Indexed]
2. Table "b" with 3 columns: id_table_a (int), condition1 (int), condition2 (int) [all columns as Primary Index]
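
For reference, the schema described above might look like the following in MySQL. This is only a sketch: the type of column1, the index name idx_a_column1, and the order of the columns in b's primary key are assumptions, since the question does not state them.

```sql
-- Sketch of the two tables from the question.
CREATE TABLE a (
    id      INT NOT NULL,
    column1 INT NOT NULL,          -- assumed type; the question only says "Indexed"
    PRIMARY KEY (id),
    KEY idx_a_column1 (column1)    -- hypothetical index name
);

CREATE TABLE b (
    id_table_a INT NOT NULL,
    condition1 INT NOT NULL,
    condition2 INT NOT NULL,
    -- "all columns as Primary Index"; this column order is an assumption
    PRIMARY KEY (id_table_a, condition1, condition2)
);
```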

I can run the following query to select the rows from table a that have a matching row in table b with condition1 = 1:

SELECT a.id FROM a
WHERE EXISTS (SELECT 1 FROM b WHERE b.id_table_a = a.id AND b.condition1 = 1 LIMIT 1)
ORDER BY a.column1 LIMIT 50

With a couple hundred million rows in both tables this query is very slow. If I do:

SELECT a.id
FROM a INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1
ORDER BY a.column1 LIMIT 50

This is nearly instant, but if several rows in table b match the same id_table_a, duplicates are returned. If I add SELECT DISTINCT or GROUP BY a.id to remove the duplicates, the query becomes extremely slow.
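
The duplicate behaviour can be reproduced with a few rows. The sample values below are hypothetical and assume tables a and b exist as described in the question, with an integer column1.

```sql
-- Hypothetical sample rows: a.id = 1 has two rows in b with condition1 = 1.
INSERT INTO a (id, column1) VALUES (1, 10), (2, 20);
INSERT INTO b (id_table_a, condition1, condition2) VALUES
    (1, 1, 1),
    (1, 1, 2),
    (2, 0, 1);

-- EXISTS form: id 1 is returned once.
SELECT a.id FROM a
WHERE EXISTS (SELECT 1 FROM b WHERE b.id_table_a = a.id AND b.condition1 = 1)
ORDER BY a.column1 LIMIT 50;

-- JOIN form: id 1 is returned twice, once per matching row in b.
SELECT a.id
FROM a INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1
ORDER BY a.column1 LIMIT 50;
```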

Here is an SQLFiddle showing the example queries: http://sqlfiddle.com/#!9/35eb9e/10

Is there a way to make a join without duplicates fast in this case?

*Edited to show that INNER instead of LEFT join didn't make much of a difference

*Edited to show moving condition to join did not make much of a difference

*Edited to add LIMIT

*Edited to add ORDER BY

asked Jul 31 '16 07:07

JJJ

2 Answers

You can try an inner join with DISTINCT:

SELECT distinct a.id 
FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1

If you use DISTINCT with SELECT *, be careful: DISTINCT then applies to every selected column, including the id columns, which can return the wrong result. In that case, list only the columns you need:

SELECT distinct col1, col2, col3 .... 
FROM a INNER JOIN b ON a.id=b.id_table_a AND b.condition1=1

You could also add a composite index that includes condition1, e.g. key(id, condition1).
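
As a rough sketch, the suggested index could be created as follows. The index name idx_b_id_cond1 is hypothetical, and "id" in key(id, condition1) is read here as b.id_table_a, which is an assumption.

```sql
-- Sketch of the suggested composite index (hypothetical name;
-- "id" is assumed to mean b.id_table_a).
ALTER TABLE b ADD INDEX idx_b_id_cond1 (id_table_a, condition1);

-- Compare the plan before and after adding the index.
EXPLAIN
SELECT DISTINCT a.id
FROM a INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1;
```

If b's primary key already starts with (id_table_a, condition1), the new index is a left prefix of it and is unlikely to change the plan, so comparing the EXPLAIN output is the simplest way to see whether it helps.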

If you can, you could also run

 ANALYZE TABLE table_name;

on both tables.

Another technique is to reverse the leading table:

SELECT distinct a.id 
FROM b INNER JOIN a ON a.id=b.id_table_a AND b.condition1=1

That is, use the most selective table to lead the query.

With this, the index usage appears to be different: http://sqlfiddle.com/#!9/35eb9e/15 (the last query adds a "Using where").

# Using DISTINCT to remove duplicates, without the column and ORDER BY
EXPLAIN
SELECT DISTINCT a.id
FROM a
INNER JOIN b ON a.id = b.id_table_a AND b.condition1 = 1;

answered Oct 01 '22 20:10

ScaisEdge

It looks like I found the answer.

SELECT a.id FROM a
INNER JOIN b ON
    b.id_table_a = a.id AND
    b.condition1 = 1 AND
    b.condition2 = (SELECT b.condition2 FROM b WHERE b.id_table_a = a.id AND b.condition1 = 1 LIMIT 1)
ORDER BY a.column1
LIMIT 5;

I don't know whether there is a flaw in this, so please let me know if there is. If anyone has a way to compress this, I will gladly accept your answer.
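
One possible weak point is that LIMIT 1 without ORDER BY does not guarantee which row the subquery returns. A variant that makes the choice explicit with MIN() is sketched below. It is untested against the large tables from the question, so its speed is not known. The alias b2 is introduced here, and the sketch assumes that (id_table_a, condition1, condition2) is unique in b, as the primary key in the question implies.

```sql
-- Sketch: same idea as the query above, but MIN() picks the matching
-- b row deterministically, so each a.id appears at most once.
SELECT a.id
FROM a
INNER JOIN b
    ON  b.id_table_a = a.id
    AND b.condition1 = 1
    AND b.condition2 = (
        SELECT MIN(b2.condition2)   -- b2 is an alias introduced for this sketch
        FROM b AS b2
        WHERE b2.id_table_a = a.id
          AND b2.condition1 = 1
    )
ORDER BY a.column1
LIMIT 5;
```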

answered Oct 01 '22 18:10

JJJ
