MySQL full text search with partial words

MySQL full-text search looks like the best way to search in SQL. However, I'm stuck on the fact that it won't search partial words. For instance, if I have an article titled "MySQL Tutorial" and search for "MySQL", it won't find it.

Having done some searching, I found various references to support for this coming in MySQL 4 (I'm using 5.1.40). I've tried using "MySQL" and "%MySQL%", but neither works (one link I found suggested the wildcard is an asterisk, but that it can only go at the beginning or the end, not both).

Here's my table structure and my query. If someone could tell me where I'm going wrong, that would be great. I'm assuming partial word matching is built in somehow.

    CREATE TABLE IF NOT EXISTS `articles` (
      `article_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
      `article_name` varchar(64) NOT NULL,
      `article_desc` text NOT NULL,
      `article_link` varchar(128) NOT NULL,
      `article_hits` int(11) NOT NULL,
      `article_user_hits` int(7) unsigned NOT NULL DEFAULT '0',
      `article_guest_hits` int(10) unsigned NOT NULL DEFAULT '0',
      `article_rating` decimal(4,2) NOT NULL DEFAULT '0.00',
      `article_site_id` smallint(5) unsigned NOT NULL DEFAULT '0',
      `article_time_added` int(10) unsigned NOT NULL,
      `article_discussion_id` smallint(5) unsigned NOT NULL DEFAULT '0',
      `article_source_type` varchar(12) NOT NULL,
      `article_source_value` varchar(12) NOT NULL,
      PRIMARY KEY (`article_id`),
      FULLTEXT KEY `article_name` (`article_name`,`article_desc`,`article_link`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=7;
    INSERT INTO `articles` VALUES
    (1, 'MySQL Tutorial', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 6, 3, 1, '1.50', 1, 1269702050, 1, '0', '0'),
    (2, 'How To Use MySQL Well', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 1, 2, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (3, 'Optimizing MySQL', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 1, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (4, '1001 MySQL Tricks', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 1, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (5, 'MySQL vs. YourSQL', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 2, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (6, 'MySQL Security', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 2, 0, '3.00', 1, 1269702050, 1, '0', '0');
    SELECT count(a.article_id)
    FROM articles a
    WHERE MATCH (a.article_name, a.article_desc, a.article_link) AGAINST ('mysql')
    GROUP BY a.article_id
    ORDER BY a.article_time_added ASC

The "a." table prefix is used because the query comes from a function that sometimes adds additional joins.

As you can see, a search for MySQL should return a count of 6, but unfortunately it doesn't.

Update

No results were returned because every single row matched.

http://dev.mysql.com/doc/refman/5.1/en/fulltext-natural-language.html

"The search result is empty because the word “MySQL” is present in at least 50% of the rows. As such, it is effectively treated as a stopword. For large data sets, this is the most desirable behavior: A natural language query should not return every second row from a 1GB table. For small data sets, it may be less desirable."
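The 50% threshold quoted above applies to natural language searches on MyISAM tables; boolean mode searches do not use it. A minimal sketch against the articles table from this question, assuming the intent is a single total (so the GROUP BY from the original query is dropped here, which is my assumption rather than part of the original):

```sql
-- Sketch: the same search in boolean mode, which is not subject to
-- the 50% threshold on MyISAM, so a word present in every row still matches.
-- Assumption: one total count is wanted, so there is no GROUP BY.
SELECT COUNT(a.article_id) AS matching_articles
FROM articles a
WHERE MATCH (a.article_name, a.article_desc, a.article_link)
      AGAINST ('mysql' IN BOOLEAN MODE);
```

Note that boolean mode does not sort results by relevance automatically, so add an explicit ORDER BY when listing rows rather than counting them.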

Rob asked Apr 26 '10 20:04


1 Answer

My understanding is that MySQL FULLTEXT indexes support partial-word searching for prefixes only, using a trailing asterisk in boolean mode: MATCH (a.article_name) AGAINST ('MySQL*' IN BOOLEAN MODE).
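As a rough sketch of what that looks like against the table in the question (the search terms 'Optim*' and '%SQL%' are illustrative choices of mine, and the LIKE query is a different technique that does not use the FULLTEXT index at all):

```sql
-- Prefix search: the trailing * matches any word that begins with "Optim",
-- e.g. "Optimizing". The column list matches the FULLTEXT key definition.
SELECT a.article_id, a.article_name
FROM articles a
WHERE MATCH (a.article_name, a.article_desc, a.article_link)
      AGAINST ('Optim*' IN BOOLEAN MODE);

-- Fallback for matches in the middle or at the end of a word
-- (e.g. "SQL" inside "YourSQL"). LIKE with a leading % cannot use
-- an index, so this scans the table.
SELECT a.article_id, a.article_name
FROM articles a
WHERE a.article_name LIKE '%SQL%';
```

The LIKE fallback is usually acceptable on small tables like this one; on large tables the full scan can become expensive.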

Matthew Flynn answered Sep 20 '22 14:09