Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient implementation of faceted search in relational databases

I am trying to implement a Faceted search or tagging with multiple-tag filtering. In the faceted navigation, only not-empty categories are displayed and the number of items in the category that are also matching already applied criteria is presented in parenthesis.

I can get all items having assigned categories using INNER JOINs and get number of items in all category using COUNT and GROUP BY, however I'm not sure how it will scale to millions of objects and thousands of tags. Especially the counting.

I know that there are some not-relational solutions like Lucene + SOLR, but I've found also some closed-source RDBMS-based implementations that are said to be entreprise-strength like FacetMap.com or Endeca software, so there must be an efficient way to perform faceted search in relational databases.

Does anybody have experience in faceted search and could give some tips?

Cache the counts for each category set? Maybe use some smart incremental technique that will update the counters?

Edit:

An example of faceted navigation can be found here: Flamenco.

Currently I have the standard 3-table scheme (items, tags and items_tags like described here: http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html#toxi ) plus a table for facets. Each tag has assigned a facet.

like image 296
Adam Dziendziel Avatar asked Dec 04 '09 16:12

Adam Dziendziel


People also ask

What is a database faceted search?

Faceted search, or faceted navigation, is a way of browsing and searching for items in a set of data by applying filters on various properties (facets) of the items in the collection.

Which are advantages of faceted browsing over hierarchical search?

Faceted search allows users to reduce the number of search results quickly to get to the desired item(s). Showing narrowing options (facets) is easier for users because they don't have to know the syntax necessary to specify their search precisely.

What is faceted search example?

Here is an example of a faceted filter: A user is looking for an iPhone on eBay. What eBay offers is a logical filter with different facet types. The visitor can choose which storage capacity they want, which iPhone model and which color they prefer.


2 Answers

I can only confirm what Nils says. RDBMS are not good for multi-dimensional searching. I have worked with some smart solutions, caching counters, using triggers, and so on. But in the end, external dedicated indexer always wins.

MAYBE, if you transform your data into dimensional model and feed it to some OLAP [I mean MDX engine] - it will perform well. But it seems a bit too heavy solution, and it will be definitely NOT real-time.

On the contrary, solution with dedicated indexing engine (think Lucene, think Sphinx) can be made near-real time with incremental index updates.

like image 139
filiprem Avatar answered Sep 23 '22 03:09

filiprem


IMO, relational databases aren't that good at searching. You would get better performance from a dedicated search engine (like Solr/Lucene).

like image 29
Nils Weinander Avatar answered Sep 22 '22 03:09

Nils Weinander