
Is there any point in creating a second column optimized for FULLTEXT searches?

In the project I'm working on, every column that needs to be searched has a companion column called "ft[columnname]". The companion column has a FULLTEXT index, and it is the only one searched against.

This column contains "optimized" text that is automatically generated from the original column in the following way:

  • The string is lowercased
  • All accents are removed
  • All punctuation and unsearchable characters are removed
  • All duplicated words are removed
  • All words are sorted from the longest to the shortest
  • Other transformations that I don't really understand (related to combined-words)

For example "I like Pokémon, especially Pikachu!" becomes "especially pokemon pikachu like i".
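To make the steps concrete, here is a rough Python sketch of the transformations I do understand. The function name `optimize_for_fulltext` is made up for illustration, the rule for what counts as "unsearchable" (anything other than a-z, 0-9 and whitespace) is my assumption, and the combined-words handling is left out entirely, so this is not the project's actual code.

```python
import re
import unicodedata


def optimize_for_fulltext(text: str) -> str:
    """Hypothetical approximation of the ft-column transformation."""
    # Step 1: lowercase the string.
    lowered = text.lower()

    # Step 2: remove accents by decomposing characters and dropping
    # the combining marks (e.g. "é" becomes "e").
    decomposed = unicodedata.normalize("NFKD", lowered)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Step 3: replace punctuation and other characters with spaces.
    # Assumption: only a-z, 0-9 and whitespace count as searchable.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", unaccented)

    # Step 4: remove duplicated words, keeping the first occurrence.
    words = list(dict.fromkeys(cleaned.split()))

    # Step 5: sort from longest to shortest. The sort is stable, so
    # words of equal length keep their original relative order.
    words.sort(key=len, reverse=True)

    return " ".join(words)


if __name__ == "__main__":
    print(optimize_for_fulltext("I like Pok\u00e9mon, especially Pikachu!"))
```

With the example sentence above, this sketch produces "especially pokemon pikachu like i", which matches what ends up in the ft column.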

Is there any performance benefit, even a very tiny one? The data in the database never changes dynamically.

Thomas Bonini asked Dec 04 '25 17:12



1 Answer

There might be a functionality benefit for your specific application, but storing the data in duplicate is largely a performance hit -- not a benefit.

Since your data is now roughly twice as big, only about half as much of it fits in the various levels of caching (e.g. MySQL, OS) once the data set is sufficiently large, so you're going to be reading from disk much more often, and disk I/O is the usual bottleneck.

Having said that, if you use a single-byte character set on the ft-indexed column but a multi-byte character set on the original text, your full-text index may be much smaller than it would have been otherwise.
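As a rough illustration of the byte-size difference, the sketch below compares encoded lengths using Python's `utf-8` and `latin-1` codecs as stand-ins for a multi-byte and a single-byte character set. That mapping is an assumption for illustration only, and the helper name `encoded_sizes` is made up. Actual MySQL row and index sizes depend on the storage engine, column type, and character set in use.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes `text` occupies in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),      # multi-byte stand-in
        "latin-1": len(text.encode("latin-1")),  # single-byte stand-in
    }


if __name__ == "__main__":
    # Accented original text: "é" (U+00E9) takes 2 bytes in UTF-8, 1 in Latin-1.
    print(encoded_sizes("Pok\u00e9mon"))
    # Accent-stripped ft text is plain ASCII, 1 byte per character in both.
    print(encoded_sizes("pokemon"))
```

Note that once accents are stripped, the text is plain ASCII, which takes one byte per character in either encoding here, so any saving depends on how the database sizes its columns and index entries rather than on the characters themselves.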

Riedsio answered Dec 06 '25 07:12



