
MySQL/PHP Search Efficiency

I'm trying to create a small search for my site. I've tried using full-text index search, but I could never get it to work. Here is what I've come up with:

    if(isset($_GET['search'])) {

        $search = str_replace('-', ' ', $_GET['search']);
        $result = array();

        $titles = mysql_query("SELECT title FROM Entries WHERE title LIKE '%$search%'");
        while($row = mysql_fetch_assoc($titles)) {
            $result[] = $row['title'];
        }

        $tags = mysql_query("SELECT title FROM Entries WHERE tags LIKE '%$search%'");
        while($row = mysql_fetch_assoc($tags)) {
            $result[] = $row['title'];
        }

        $text = mysql_query("SELECT title FROM Entries WHERE entry LIKE '%$search%'");
        while($row = mysql_fetch_assoc($text)) {
            $result[] = $row['title'];
        }

        $result = array_unique($result);
    }

So basically, it searches through the titles, body text, and tags of every entry in the DB. This works decently well, but I'm wondering how efficient it is. It would only be for a small blog, but either way I'd like to know whether it could be made more efficient.

williamg asked Jun 01 '10 23:06




1 Answer

There's no way to make LIKE '%pattern%' queries efficient. Once you have a nontrivial amount of data, those wildcard queries run hundreds or thousands of times slower than a fulltext indexing solution.
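One way to see the problem on your own data is to ask MySQL for the query plan. This is only a sketch: it assumes title is a VARCHAR column, and idx_title is a hypothetical index added just for the comparison.

```sql
-- Hypothetical ordinary (B-tree) index, added only for this illustration.
-- Assumes title is a VARCHAR column; a TEXT column would need a prefix length.
CREATE INDEX idx_title ON Entries (title);

-- Leading wildcard: the index cannot be used to narrow the search,
-- so the plan should show every row (or every index entry) being examined.
EXPLAIN SELECT title FROM Entries WHERE title LIKE '%mysql%';

-- Prefix-only pattern, for contrast: here the optimizer can use a
-- range scan on idx_title.
EXPLAIN SELECT title FROM Entries WHERE title LIKE 'mysql%';
```

Compare the type and rows columns of the two plans. The first pattern is the one the search in the question relies on, and its cost grows with the size of the table.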

You should look at the presentation I did for MySQL University: http://www.slideshare.net/billkarwin/practical-full-text-search-with-my-sql

Here's how to get it to work:

  1. First make sure your table uses the MyISAM storage engine. MySQL FULLTEXT indexes support only MyISAM tables. (edit 11/1/2012: MySQL 5.6 is introducing a FULLTEXT index type for InnoDB tables.)

    ALTER TABLE Entries ENGINE=MyISAM;
    
  2. Create a fulltext index.

    CREATE FULLTEXT INDEX searchindex ON Entries(title, tags, entry);
    
  3. Search it!

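    // Escape user input before interpolating it into the SQL string.
    // (The mysql_* functions were removed in PHP 7; on current PHP use
    // mysqli or PDO with a prepared statement instead.)
    $search = mysql_real_escape_string($search);
    $result = array();
    $titles = mysql_query("SELECT title FROM Entries
        WHERE MATCH(title, tags, entry) AGAINST('$search')");
    while($row = mysql_fetch_assoc($titles)) {
        $result[] = $row['title'];
    }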
    

    Note that the columns you name in the MATCH clause must be the same columns in the same order as those you declared in the fulltext index definition. Otherwise it won't work.
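If the search in step 3 returns nothing or raises an error, a few statements can help confirm the setup. This is a sketch against the same Entries table and searchindex index defined above; the search word 'linux' and the extra index name titleindex are made up for illustration.

```sql
-- Step 1 check: the Engine column should read MyISAM
-- (or InnoDB on MySQL 5.6 and later).
SHOW TABLE STATUS LIKE 'Entries';

-- Step 2 check: searchindex should be listed with Index_type = FULLTEXT,
-- one row per indexed column (title, tags, entry).
SHOW INDEX FROM Entries;

-- Same column list as the index definition, so this can use searchindex.
SELECT title FROM Entries
WHERE MATCH(title, tags, entry) AGAINST('linux');

-- A different column list has no matching index; in natural-language
-- mode MySQL is expected to reject this with a
-- "Can't find FULLTEXT index matching the column list" error.
SELECT title FROM Entries
WHERE MATCH(title) AGAINST('linux');

-- To search the title alone, declare a second index for exactly that
-- column list (titleindex is a hypothetical name).
CREATE FULLTEXT INDEX titleindex ON Entries(title);
```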


"I've tried using full-text index search, but I could never get it to work... I'm just wondering if this could be made any more efficient."

This is exactly like saying, "I couldn't figure out how to use this chainsaw, so I decided to cut down this redwood tree with a pocketknife. How can I make that work as well as the chainsaw?"


Regarding your comment about searching for words that match more than 50% of the rows:

The MySQL manual says this:

Users who need to bypass the 50% limitation can use the boolean search mode; see Section 11.8.2, “Boolean Full-Text Searches”.

And this:

The 50% threshold for natural language searches is determined by the particular weighting scheme chosen. To disable it, look for the following line in storage/myisam/ftdefs.h:

    #define GWS_IN_USE GWS_PROB

Change that line to this:

    #define GWS_IN_USE GWS_FREQ

Then recompile MySQL. There is no need to rebuild the indexes in this case.
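If recompiling the server is not an option, the boolean mode mentioned in the first manual excerpt only needs a change to the query. A minimal sketch, reusing the columns of the fulltext index on Entries; the search words are placeholders:

```sql
-- Boolean mode does not apply the 50% threshold, so words that appear
-- in most rows can still match.
SELECT title FROM Entries
WHERE MATCH(title, tags, entry) AGAINST('+mysql +search' IN BOOLEAN MODE);
```

The leading + marks a word as required. Unlike natural-language searches, boolean-mode results are not automatically sorted by relevance.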

Also, you might be searching for stopwords: words such as "the" that the fulltext search ignores because they're too common. See http://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html

Bill Karwin answered Oct 30 '22 01:10