Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to quickly search book titles?

I have a database of about 200k books. I wish to give my users a way to quickly search a book by the title. Now, some titles might have prefix like A, THE, etc. and also can have numbers in the title, so search for 12 should match books with "12", "twelve" and "dozen" in the title. This will work via AJAX, so I need to make sure database query is really fast.

I assume that most of the users will try to search using some words of the title, so I'm thinking to split all the titles into words and create a separate database table which would map words to titles. However, I fear this might not give the best results. For example, the book title could be some 2 or 3 commonly used words, and I might get a list of books with longer titles that contain all 2-3 words and the one I'm looking for lost like a needle in a haystack. Also, searching for a book with many words in the title might slow down the query because of a lot of OR clauses.

Basically, I'm looking for a way to:

  • find the results quickly
  • sort them by relevance.

I assume this is not the first time someone needs something like this, and I'd hate to reinvent the wheel.

P.S. I'm currently using MySQL, but I could switch to anything else if needed.

like image 990
Milan Babuškov Avatar asked May 08 '11 15:05

Milan Babuškov


People also ask

How do you find a book if you forgot the title?

Google Books Search You can quickly view sample pages to ensure that the book you've found is the book you've been looking for. There are several other book databases you can find as well, such as Book Finder, Word Cat, LibraryThing, Book Sleuth, Goodreads, Amazon's Advanced Book Search, and Library of Congress.

How do I find a title of a book?

If you can remember just one word, use the search function on Goodreads or Library Thing to find long lists of titles with a particular word. Goodreads' browse-able lists of titles that readers have shelved in unique categories, such as authors' professions or decades of publication, is also be helpful.


1 Answers

Using a SOUNDEX is the best way i think.

SELECT
  id,
  title
FROM products AS p
WHERE p.title SOUNDS LIKE 'Shaw'

// This will match 'Saw' etc.

For best database performances you can best calculate the SOUNDEX value of your titles and put this in a new column. You can calculate the soundex with SOUNDEX('Hello').

Example usage:

UPDATE `books` SET `soundex_title` = SOUNDEX(title);
like image 102
Tom Claus Avatar answered Sep 24 '22 02:09

Tom Claus