
Adding multiple conditions to MySQL Inner Join

Tags: sql, mysql

The goal of the query is to find possible duplicates of names that were mistyped. For example:

International Group Inc. must be found as a duplicate of International, Group Inc

To accomplish this I used the following query:

SELECT C.id,
       C.name,
       C.address,
       C.city_id
FROM   company C
       INNER JOIN (SELECT name
                   FROM   company
                   GROUP  BY name
                   HAVING Count(id) > 1) D
               ON Replace(Replace(C.name, '.', ''), ',', '') =
                  Replace(Replace(D.name, '.', ''), ',', '')  

It works well and returns results in about 40 seconds, but adding an extra condition such as AND C.city_id='4' adds another minute or more. That is still acceptable, but not ideal.

My real problem occurs when I add another condition to find only duplicates of companies whose name contains a specific string. With the condition AND C.name LIKE '%International%', the query just doesn't return any results.

Could somebody help me figure out what I am doing wrong?

Thanks

gustyaquino asked May 07 '13 12:05




1 Answer

Because you are joining on the result of a function, the query cannot use any index. Besides, the cost of executing the REPLACE() on all rows is probably not negligible.
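One way to check this is to prefix the query with EXPLAIN. The sketch below reuses the query from the question unchanged; what the plan shows depends on which indexes exist on company, and the question does not list them.

```sql
-- Ask MySQL for the execution plan instead of running the query.
EXPLAIN
SELECT C.id,
       C.name,
       C.address,
       C.city_id
FROM   company C
       INNER JOIN (SELECT name
                   FROM   company
                   GROUP  BY name
                   HAVING COUNT(id) > 1) D
               ON REPLACE(REPLACE(C.name, '.', ''), ',', '') =
                  REPLACE(REPLACE(D.name, '.', ''), ',', '');
```

In the output, a row whose type is ALL and whose key is NULL means that table is read with a full scan.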

I suggest you first add an indexed column that receives the "stripped-down" version of the strings, and then run the query with a join on this column:

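-- Extra column holding the name without dots and commas.
-- VARCHAR(50) is a guess; it should be at least as long as the name column.
ALTER TABLE company ADD COLUMN stripped_name VARCHAR(50);
ALTER TABLE company ADD INDEX(stripped_name);
-- Populate the new column for the existing rows.
UPDATE company SET stripped_name = REPLACE(REPLACE(name, '.', ''), ',', '') ;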
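Once stripped_name is populated, the join could look roughly like the sketch below. It assumes the subquery should group on stripped_name rather than name, so that rows differing only in punctuation count as duplicates; that is an interpretation of the question, not something the original query does. The two filters from the question are placed in a WHERE clause.

```sql
SELECT C.id,
       C.name,
       C.address,
       C.city_id
FROM   company C
       INNER JOIN (SELECT stripped_name
                   FROM   company
                   -- Assumption: group on the stripped value, not on name.
                   GROUP  BY stripped_name
                   HAVING COUNT(id) > 1) D
               -- Plain column comparison, so the index on stripped_name is usable.
               ON C.stripped_name = D.stripped_name
WHERE  C.city_id = '4'
       AND C.name LIKE '%International%';
```

Note that a LIKE pattern with a leading % cannot use an index on name, so that filter is still evaluated row by row on the joined rows.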

Running the UPDATE could take a while the first time, but you could also add BEFORE INSERT and BEFORE UPDATE triggers on company so that stripped_name gets populated and updated on the fly.
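A minimal sketch of such triggers follows. The trigger names are made up for illustration, and the expression simply repeats the one used in the UPDATE above.

```sql
-- Fill stripped_name whenever a new row is inserted.
CREATE TRIGGER company_stripped_name_bi
BEFORE INSERT ON company
FOR EACH ROW
SET NEW.stripped_name = REPLACE(REPLACE(NEW.name, '.', ''), ',', '');

-- Keep stripped_name in sync whenever a row is updated.
CREATE TRIGGER company_stripped_name_bu
BEFORE UPDATE ON company
FOR EACH ROW
SET NEW.stripped_name = REPLACE(REPLACE(NEW.name, '.', ''), ',', '');
```

Because each trigger body is a single SET statement, no BEGIN ... END block or DELIMITER change is needed.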

RandomSeed answered Oct 02 '22 00:10