Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Speed up a query, simple inner join with one large table and one small table

I have a table T1 with 60 rows and 5 columns: ID1, ID2, info1, info2, info3.

I have a table T2 with 1.2 million rows and another 5 columns: ID3, ID2, info4, info5, info6.

I want to get (ID1, ID2, info4, info5, info6) from all the rows where the ID2s match up. Currently my query looks like this:

SELECT T1.ID1, T2.ID2,
       T2.info4, T2.info5, T2.info6
  FROM T1, T2
 WHERE T1.ID2 = T2.ID2;

This takes about 15 seconds to run. My question is - should it take that long, and if not, how can I speed it up? I figure it shouldn't since T1 is so small.

I asked PostgreSQL to EXPLAIN the query, and it says that it hashes T2, then hash joins that hash with T1. It seems hashing T2 is what takes so long. Is there any way to write the query so it doesn't have to hash T2? Or, is there a way to have it cache the hash of T2 so it doesn't re-do it? The tables will only be updated every few days.

If it makes a difference, T1 is a temporary table created earlier in the session.

like image 321
Claudiu Avatar asked Jun 09 '10 15:06

Claudiu


People also ask

What is faster one big query or many small queries SQL?

However, in the context of the question, a single large query will be faster that, let's say -in the worse possible scenario- a SELECT inside a programming loop (no matter the RDBMS used).

What is the most efficient way of joining 2 table in same database?

Method 1: Relational Algebra Relational algebra is the most common way of writing a query and also the most natural way to do so. The code is clean, easy to troubleshoot, and unsurprisingly, it is also the most efficient way to join two tables.


1 Answers

It should not take that long :)

Creating an index on T2( ID2 ) should improve the performance of your query:

CREATE INDEX idx_t2_id2 ON t2 (id2);
like image 78
Peter Lang Avatar answered Nov 14 '22 22:11

Peter Lang