Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to use a like with a join in sql?

Using INSTR:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON INSTR(b.column, a.column) > 0

Using LIKE:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON b.column LIKE '%'+ a.column +'%'

Using LIKE, with CONCAT:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON b.column LIKE CONCAT('%', a.column ,'%')

Mind that in all options, you'll probably want to drive the column values to uppercase BEFORE comparing to ensure you are getting matches without concern for case sensitivity:

SELECT *
  FROM (SELECT UPPER(a.column) 'ua'
         TABLE a) a
  JOIN (SELECT UPPER(b.column) 'ub'
         TABLE b) b ON INSTR(b.ub, a.ua) > 0

The most efficient will depend ultimately on the EXPLAIN plan output.

JOIN clauses are identical to writing WHERE clauses. The JOIN syntax is also referred to as ANSI JOINs because they were standardized. Non-ANSI JOINs look like:

SELECT *
  FROM TABLE a,
       TABLE b
 WHERE INSTR(b.column, a.column) > 0

I'm not going to bother with a Non-ANSI LEFT JOIN example. The benefit of the ANSI JOIN syntax is that it separates what is joining tables together from what is actually happening in the WHERE clause.


In MySQL you could try:

SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');

Of course this would be a massively inefficient query because it would do a full table scan.

Update: Here's a proof


create table A (MYCOL varchar(255));
create table B (MYCOL varchar(255));
insert into A (MYCOL) values ('foo'), ('bar'), ('baz');
insert into B (MYCOL) values ('fooblah'), ('somethingfooblah'), ('foo');
insert into B (MYCOL) values ('barblah'), ('somethingbarblah'), ('bar');
SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');
+-------+------------------+
| MYCOL | MYCOL            |
+-------+------------------+
| foo   | fooblah          |
| foo   | somethingfooblah |
| foo   | foo              |
| bar   | barblah          |
| bar   | somethingbarblah |
| bar   | bar              |
+-------+------------------+
6 rows in set (0.38 sec)

If this is something you'll need to do often...then you may want to denormalize the relationship between tables A and B.

For example, on insert to table B, you could write zero or more entries to a juncion table mapping B to A based on partial mapping. Similarly, changes to either table could update this association.

This all depends on how frequently tables A and B are modified. If they are fairly static, then taking a hit on INSERT is less painful then repeated hits on SELECT.


Using conditional criteria in a join is definitely different than the Where clause. The cardinality between the tables can create differences between Joins and Where clauses.

For example, using a Like condition in an Outer Join will keep all records in the first table listed in the join. Using the same condition in the Where clause will implicitly change the join to an Inner join. The record has to generally be present in both tables to accomplish the conditional comparison in the Where clause.

I generally use the style given in one of the prior answers.

tbl_A as ta
    LEFT OUTER JOIN tbl_B AS tb
            ON ta.[Desc] LIKE '%' + tb.[Desc] + '%'

This way I can control the join type.


When writing queries with our server LIKE or INSTR (or CHARINDEX in T-SQL) takes too long, so we use LEFT like in the following structure:

select *
from little
left join big
on left( big.key, len(little.key) ) = little.key

I understand that might only work with varying endings to the query, unlike other suggestions with '%' + b + '%', but is enough and much faster if you only need b+'%'.

Another way to optimize it for speed (but not memory) is to create a column in "little" that is "len(little.key)" as "lenkey" and user that instead in the query above.