Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL Match ... Against query very slow

Tags:

php

mysql

propel

I work on a website running on Propel ORM and I have this query:

if(isset($args["QueryText"]) && $args["QueryText"] != "") {
  $query = $query
    ->withColumn("(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE))", "RequestRelevance")
    ->condition('cond1', "(MATCH (Request.Subject, Request.Detail) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE) + MATCH (Response.Response) AGAINST ('" . $args["QueryText"] . "' IN BOOLEAN MODE)) > 0.2")
    ->condition('cond2', 'Request.Id = ?', $args["QueryText"])
    ->where(array('cond1', 'cond2'), 'or')
    ->orderBy("RequestRelevance", Criteria::DESC);
}

which translates to the following in SQL:

SELECT DISTINCT 
(MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
    MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) 
  AS RequestRelevance, requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID)
WHERE ((MATCH (requests.subject, requests.detail) AGAINST ('46104' IN BOOLEAN MODE) + 
  MATCH (Response.response) AGAINST ('46104' IN BOOLEAN MODE)) > 0.2 OR requests.requestID = '46104') 
ORDER BY requests.created ASC,RequestRelevance DESC

It takes a good 20 second to load on the website using Propel and 7.020 seconds when running the SQL query.

I tried the following one instead:

SELECT DISTINCT 
  requests.requestID AS "Id", requests.subject AS "Subject", requests.detail AS "Detail", requests.created AS "CreatedDate", requests.lastresponsedate AS "LastResponseDate", SupportStatus.supportstatusID AS "SupportStatus.Id", SupportStatus.supportstatus AS "SupportStatus.Name", SupportStatus.isnew AS "SupportStatus.IsNew", SupportStatus.isclosed AS "SupportStatus.IsClosed", CustomerGroup.customergroupID AS "CustomerGroup.Id", CustomerGroup.customergroup AS "CustomerGroup.Name", Site.siteID AS "Site.Id", Site.site AS "Site.Name", InternalUser.userID AS "InternalUser.Id", InternalUser.username AS "InternalUser.Username", User.userID AS "User.Id", User.username AS "User.Username", Customer.customerID AS "Customer.Id", Customer.customer AS "Customer.Name", Customer.customergroupID AS "Customer.CustomerGroupId", Customer.rate AS "Customer.Rate" 
FROM requests 
  LEFT JOIN responses Response ON (requests.requestID=Response.requestID) 
  INNER JOIN supportstatus SupportStatus ON (requests.supportstatusID=SupportStatus.supportstatusID) 
  INNER JOIN customergroups CustomerGroup ON (requests.customergroupID=CustomerGroup.customergroupID)
  INNER JOIN customers Customer ON (requests.customerID=Customer.customerID) 
  INNER JOIN sites Site ON (requests.siteID=Site.siteID) 
  LEFT JOIN users InternalUser ON (requests.internaluserID=InternalUser.userID) 
  LEFT JOIN users User ON (requests.userID=User.userID) 
WHERE (requests.subject LIKE '%46104%' OR requests.detail LIKE '%46104%' OR  Response.response LIKE '%46104%' OR requests.requestID = '46104') 
ORDER BY requests.created

which takes 3.308 seconds to execute. Removing OR Response.response LIKE '%46104%' from it reduces the time to 0.140 seconds. The responses table contains 288317 rows and the responses.responses column is a TEXT() column with a FULLTEXT index.

What would be the best way to reduce the execution time of this search? I have tried using this https://dba.stackexchange.com/questions/15214/why-is-like-more-than-4x-faster-than-match-against-on-a-fulltext-index-in-mysq answer however when I execute:

SELECT responseID FROM
(
 SELECT * FROM responses
 WHERE requestID = 15000
 AND responseID != 84056
) A
WHERE MATCH(response) AGAINST('twisted');

I get this error:

Error Code: 1191. Can't find FULLTEXT index matching the column list

Help will be greatly appreciated!

EDIT 1:

Tried @Richard EB's query:

ALTER TABLE responses ADD FULLTEXT(response)
288317 row(s) affected Records: 288317  Duplicates: 0  Warnings: 0  78.967 sec

However:

SELECT responseID FROM (     SELECT * FROM responses     WHERE requestID = 15000     AND responseID != 84056 ) A WHERE MATCH(response) AGAINST('twisted') LIMIT 0, 2000 
Error Code: 1191. Can't find FULLTEXT index matching the column list    0.000 sec

Removing DISTINCT reduces the execution time to 0.952 seconds however it doesn't retrieve the results I need.

EDIT 2:

Executing this query:

SELECT DISTINCT 
responses.requestID AS "Id" FROM responses WHERE MATCH(response) AGAINST('twisted')

takes 0.062 secs.

Executing this:

SELECT DISTINCT responses.requestID
  FROM responses 
WHERE (
  MATCH (Responses.response) AGAINST ('twisted' IN BOOLEAN MODE)
) 
ORDER BY responses.requestID ASC

only takes 0.046 secs. However, selecting from Requests and joining Responses is what slows down the query. I'm not sure if it means that the whole query has to be completely rewritten to select from Responses and joining Requests instead?

EDIT 3:

Here are the indexes that I have on Requests and Responses tables:

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    responses, 0, PRIMARY, 1, responseID, A, 288317, , , , BTREE, , 
    responses, 1, requestID, 1, requestID, A, 48052, , , YES, BTREE, , 
    responses, 1, response, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_2, 1, response, , 1, , , YES, FULLTEXT, , 
    responses, 1, response_3, 1, response, , 1, , , YES, FULLTEXT, , 

    # Table, Non_unique, Key_name, Seq_in_index, Column_name, Collation, Cardinality, Sub_part, Packed, Null, Index_type, Comment, Index_comment
    requests, 0, PRIMARY, 1, requestID, A, 46205, , , , BTREE, , 
    requests, 1, supportstatusID, 1, supportstatusID, A, 14, , , YES, BTREE, , 
    requests, 1, internaluserID, 1, internaluserID, A, 344, , , YES, BTREE, , 
    requests, 1, customergroupID, 1, customergroupID, A, 198, , , , BTREE, , 
    requests, 1, userID, 1, userID, A, 1848, , , YES, BTREE, , 
    requests, 1, siteID, 1, siteID, A, 381, , , YES, BTREE, , 
    requests, 1, request, 1, subject, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 2, detail, , 1, , , YES, FULLTEXT, , 
    requests, 1, request, 3, ponumber, , 1, , , YES, FULLTEXT, , 
like image 788
Pawel Avatar asked Oct 21 '15 16:10

Pawel


1 Answers

A LIKE search will go through every record and perform non-exact string comparison, which is why it's so slow.

The mysql error you've pasted indicates that the column referenced in the MATCH clause doesn't have a fulltext index (or it does, but it's not as it's referenced in the MATCH clause).

Assuming you're using MyISAM or have MySQL 5.6 and InnoDB, try:

ALTER TABLE responses ADD FULLTEXT(response);

Edit: Answer (in comments): "If there are two columns mentioned in the MATCH statement, then there should be one fulltext index on both of those columns, not two fulltext indexes on each column"

like image 183
Richard EB Avatar answered Oct 17 '22 14:10

Richard EB