Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there something wrong with joins that don't use the JOIN keyword in SQL or MySQL?

Tags:

sql

join

mysql

People also ask

Can you join two tables without using the keyword join?

Yes, Tables Can Be Joined Without the JOIN Keyword As you have just seen, it's not always necessary to use the JOIN keyword to combine two tables in SQL.

Can we use joins without foreign key?

You don't need a FK, you can join arbitrary columns. But having a foreign key ensures that the join will actually succeed in finding something. Foreign key give you certain guarantees that would be extremely difficult and error prone to implement otherwise.

Which join we can use without a join condition?

We can use 'cross join' without on condition. Cross join gives the result in cartesian product form. For instance, if in one table there are 3 records and another table has 2 records, then the first record will match with all the second table records.

What can I use instead of join in SQL?

Subqueries allow you to use the results of another query in the outer query. In some cases, subqueries can replace complex joins and unions.


Filtering joins solely using WHERE can be extremely inefficient in some common scenarios. For example:

SELECT * FROM people p, companies c 
    WHERE p.companyID = c.id AND p.firstName = 'Daniel'

Most databases will execute this query quite literally, first taking the Cartesian product of the people and companies tables and then filtering by those which have matching companyID and id fields. While the fully-unconstrained product does not exist anywhere but in memory and then only for a moment, its calculation does take some time.

A better approach is to group the constraints with the JOINs where relevant. This is not only subjectively easier to read but also far more efficient. Thusly:

SELECT * FROM people p JOIN companies c ON p.companyID = c.id
    WHERE p.firstName = 'Daniel'

It's a little longer, but the database is able to look at the ON clause and use it to compute the fully-constrained JOIN directly, rather than starting with everything and then limiting down. This is faster to compute (especially with large data sets and/or many-table joins) and requires less memory.

I change every query I see which uses the "comma JOIN" syntax. In my opinion, the only purpose for its existence is conciseness. Considering the performance impact, I don't think this is a compelling reason.


The more verbose INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN are from the ANSI SQL/92 syntax for joining. For me, this verbosity makes the join more clear to the developer/DBA of what the intent is with the join.


In SQL Server there are always query plans to check, a text output can be made as follows:

SET SHOWPLAN_ALL ON
GO

DECLARE @TABLE_A TABLE
(
    ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Data VARCHAR(10) NOT NULL
)
INSERT INTO @TABLE_A
SELECT 'ABC' UNION 
SELECT 'DEF' UNION
SELECT 'GHI' UNION
SELECT 'JKL' 

DECLARE @TABLE_B TABLE
(
    ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Data VARCHAR(10) NOT NULL
)
INSERT INTO @TABLE_B
SELECT 'ABC' UNION 
SELECT 'DEF' UNION
SELECT 'GHI' UNION
SELECT 'JKL' 

SELECT A.Data, B.Data
FROM
    @TABLE_A AS A, @TABLE_B AS B
WHERE
    A.ID = B.ID

SELECT A.Data, B.Data
FROM
    @TABLE_A AS A
    INNER JOIN @TABLE_B AS B ON A.ID = B.ID

Now I'll omit the plan for the table variable creates, the plan for both queries is identical though:

 SELECT A.Data, B.Data  FROM   @TABLE_A AS A, @TABLE_B AS B  WHERE   A.ID = B.ID
  |--Nested Loops(Inner Join, OUTER REFERENCES:([A].[ID]))
       |--Clustered Index Scan(OBJECT:(@TABLE_A AS [A]))
       |--Clustered Index Seek(OBJECT:(@TABLE_B AS [B]), SEEK:([B].[ID]=@TABLE_A.[ID] as [A].[ID]) ORDERED FORWARD)
 SELECT A.Data, B.Data  FROM   @TABLE_A AS A   INNER JOIN @TABLE_B AS B ON A.ID = B.ID
  |--Nested Loops(Inner Join, OUTER REFERENCES:([A].[ID]))
       |--Clustered Index Scan(OBJECT:(@TABLE_A AS [A]))
       |--Clustered Index Seek(OBJECT:(@TABLE_B AS [B]), SEEK:([B].[ID]=@TABLE_A.[ID] as [A].[ID]) ORDERED FORWARD)

So, short answer - No need to rewrite, unless you spend a long time trying to read them each time you maintain them?


It's more of a syntax choice. I prefer grouping my join conditions with my joins, hence I use the INNER JOIN syntax

SELECT a.someRow, b.someRow
FROM tableA AS a
INNER JOIN tableB AS b
  ON a.ID = b.ID
WHERE b.ID = ?

(? being a placeholder)