Mysql fulltext search relevance across multiple tables

I have been tasked with creating a site-wide search feature. The search needs to look at articles, events and page content.

I've used MATCH()/AGAINST() in MySQL before and know how to get the relevance of a result. As far as I know, though, the relevance score depends on the table being searched (its contents, number of rows, etc.), so the relevance of results from the articles table won't be comparable to the relevance of results from the events table.

Is there any way to unify the relevance so that results from all three tables have a comparable relevance?

michael asked Jan 26 '12 13:01


1 Answer

Yes, you can unify them very well using a search engine such as Apache Lucene and Solr.

http://lucene.apache.org/solr/

If you need to do it only in MySQL, you can combine the per-table searches with a UNION. You'll probably want to suppress any results with zero relevance.

You'll need to decide how to weight the relevance depending on which table matches.

For example, suppose you want articles to be most important, events to be moderately important, and pages to be least important. You can use multipliers like this:

set @articles_multiplier=3;
set @events_multiplier=2;
set @pages_multiplier=1;

Here's a working example you can try that demonstrates some of these techniques:

Create sample data:

create database d;
use d;

create table articles (id int primary key, content text) ENGINE = MYISAM;
create table events (id int primary key, content text) ENGINE = MYISAM;
create table pages (id int primary key, content text) ENGINE = MYISAM;

insert into articles values 
(1, "Lorem ipsum dolor sit amet"),
(2, "consectetur adipisicing elit"),
(3, "sed do eiusmod tempor incididunt");

insert into events values 
(1, "Ut enim ad minim veniam"),
(2, "quis nostrud exercitation ullamco"),
(3, "laboris nisi ut aliquip");

insert into pages values 
(1, "Duis aute irure dolor in reprehenderit"),
(2, "in voluptate velit esse cillum"),
(3, "dolore eu fugiat nulla pariatur.");

Make it searchable:

ALTER TABLE articles ADD FULLTEXT(content);
ALTER TABLE events ADD FULLTEXT(content);
ALTER TABLE pages ADD FULLTEXT(content);

Use a UNION to search all these tables:

set @target='dolor';

-- UNION ALL skips the duplicate check; rows cannot collide because table_name differs.
SELECT * FROM (
  SELECT
    'articles' AS table_name,
    id,
    @articles_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
    FROM articles
  UNION ALL
  SELECT
    'events' AS table_name,
    id,
    @events_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
    FROM events
  UNION ALL
  SELECT
    'pages' AS table_name,
    id,
    @pages_multiplier * (MATCH(content) AGAINST (@target)) AS relevance
    FROM pages
) AS sitewide
WHERE relevance > 0       -- drop rows that did not match
ORDER BY relevance DESC;  -- best matches first

The result:

+------------+----+------------------+
| table_name | id | relevance        |
+------------+----+------------------+
| articles   |  1 | 1.98799377679825 |
| pages      |  1 | 0.65545331108093 |
+------------+----+------------------+
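
The multipliers weight the tables, but the raw MATCH() scores they scale are still computed separately per table. One possible way to put them on a common scale first is to divide each score by the highest score in its own table, so every table's best hit starts at 1.0 before the multiplier is applied. This is only a sketch under that assumption, not something the example above tested. The variables @articles_max, @events_max and @pages_max are hypothetical names introduced here; the tables, multipliers and @target are the ones defined earlier.

```sql
-- Highest raw score per table for the current search term.
-- These variable names are illustrative assumptions, not part of the example above.
SET @articles_max = (SELECT MAX(MATCH(content) AGAINST (@target)) FROM articles);
SET @events_max   = (SELECT MAX(MATCH(content) AGAINST (@target)) FROM events);
SET @pages_max    = (SELECT MAX(MATCH(content) AGAINST (@target)) FROM pages);

-- Scale each table's scores relative to its own best hit, then apply the table weight.
-- The WHERE clauses keep only matching rows, so a table with no matches returns no rows.
SELECT 'articles' AS table_name, id,
       @articles_multiplier * (MATCH(content) AGAINST (@target)) / @articles_max AS relevance
  FROM articles
 WHERE MATCH(content) AGAINST (@target)
UNION ALL
SELECT 'events' AS table_name, id,
       @events_multiplier * (MATCH(content) AGAINST (@target)) / @events_max AS relevance
  FROM events
 WHERE MATCH(content) AGAINST (@target)
UNION ALL
SELECT 'pages' AS table_name, id,
       @pages_multiplier * (MATCH(content) AGAINST (@target)) / @pages_max AS relevance
  FROM pages
 WHERE MATCH(content) AGAINST (@target)
ORDER BY relevance DESC;
```

With this scaling the top hit in each table should land at that table's multiplier, and weaker hits fall proportionally below it. Whether scaling by the per-table maximum suits real content is a judgment call: it treats the best match in every table as equally good, which may not hold when one table has only weak matches.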
joelparkerhenderson answered Oct 09 '22 01:10