For an airport input field with autocompletion, there currently is one table with airport descriptions, autocomplete_airport
:
lang | description (with INDEX) | ...
-----+----------------------------------------------------+----
pt | New York - John F Kennedy (JFK), Estados Unidos | ...
pt | Nova Iorque - John F Kennedy (JFK), Estados Unidos | ...
...
Autocompletion works on individual words. So when the user enters "yor", then "new york" shows up (if in LIMIT). The query currently works as follows:
SELECT * FROM autocomplete_airport WHERE lang = "pt"
AND (description LIKE "%(yor)%"
OR description LIKE "yor%"
OR description LIKE "% yor%")
ORDER BY description
LIMIT 15
Now I wonder how to speed up things. One idea is to create the following database structure, with tables autocomplete_airport
and autocomplete_airport_word
:
id | lang | description (with INDEX) | ...
-----+------+----------------------------------------------------+----
123 | pt | New York - John F Kennedy (JFK), Estados Unidos | ...
124 | pt | Nova Iorque - John F Kennedy (JFK), Estados Unidos | ...
...
word (with INDEX) | autocomplete_airport_id
------------------+------------------------
New | 123
York | 123
John | 123
F | 123
Kennedy | 123
JFK | 123
...
Then SELECT would only need to search at the beginning of strings:
SELECT DISTINCT autocomplete_airport.*
FROM autocomplete_airport
INNER JOIN autocomplete_airport_word
ON autocomplete_airport.id = autocomplete_airport_word.autocomplete_airport_id
WHERE lang = "pt"
AND word LIKE "yor%"
ORDER BY description
LIMIT 15
Is that new structure worth the trouble? Would it really speed up things? Is there a simpler way?
Just noticed that the word table has a flaw. The consequence: Searching for "New York" would not give any result. What should work:
term (with INDEX) | autocomplete_airport_id
------------------------------------------------+------------------------
New York - John F Kennedy (JFK), Estados Unidos | 123
York - John F Kennedy (JFK), Estados Unidos | 123
John F Kennedy (JFK), Estados Unidos | 123
F Kennedy (JFK), Estados Unidos | 123
Kennedy (JFK), Estados Unidos | 123
(JFK), Estados Unidos | 123
Estados Unidos | 123
Unidos | 123
JFK | 123
As MartinK says, if your table has just a few hundred rows, your query should be pretty quick even without optimization - it's worth checking what's going on.
However, the best way to search text fields is to use full text indexing (http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html) - this is designed for exactly the case you're describing.
Your proposed approach may speed up your query. The important thing about LIKE queries is that the wildcard % must not be at the start of the pattern.
LIKE '%foobar'
cannot make use of an index.
LIKE 'foobar%'
can make use of an index.
However if you just have a few hundreds airports to store I'd think twice if a full table scan really hurts that much.
Use EXPLAIN select {rest of query}
to find out how and if the DB is using an.
See http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html (Scroll down to B-Tree Index Characteristics, this is the default mysql index type)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With