Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mysql match against not return case insensitive results

I have two tables:

CREATE TABLE IF NOT EXISTS `test1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `bucket_id` int(10) unsigned NOT NULL COMMENT 'folder this component belongs to',
  `test1_name` varchar(81) NOT NULL COMMENT 'Name of this component',
  `test1_desc` varchar(1024) NOT NULL COMMENT 'Component Description',
  PRIMARY KEY (`id`),
  FULLTEXT KEY `test1_search` (`test1_name`,`test1_desc`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=3 ;

CREATE TABLE IF NOT EXISTS `bucket` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `bkt_name` varchar(81) NOT NULL COMMENT 'The name of this bucket',
  `bkt_desc` varchar(1024) NOT NULL COMMENT 'A description of this bucket',
  `bkt_keywords` varchar(512) DEFAULT NULL COMMENT 'keywords for searches',
  PRIMARY KEY (`id`),
  FULLTEXT KEY `fldr_search` (`bkt_desc`,`bkt_keywords`,`bkt_name`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;

Bucket is just a holder while test1 contains all the things that would go into a bucket. For example:

INSERT INTO `bucket` (`id`, `bkt_name`, `bkt_desc`, `bkt_keywords`) VALUES
(1, 'Simpsons', 'The Simpsons Cartoon Family was first successful adult cartoon series', 'Homer, Marge, Lisa and Bart'),
(2, 'Griffins', 'The family from the popular family guy series', 'Peter, Lois, Meg, Chris, Stewie, Brian');

INSERT INTO `test1` (`id`, `bucket_id`, `bkt_name`, `bkt_desc`) VALUES
(1, 1, 'Homer Simpson', 'Homer the figurative head of the Simpsons Family and is the husband of Marge'),
(2, 2, 'Peter Griffin', 'Peter the figurative head of the Griffin family on the hit TV seriers The family Guy');

Now, using the following query I want to look for all buckets whose name, description or keywords contain the search term "family" or whose components contain the words "family")

So far, what I have is this query and it's not returning mixed case results as in "Family" is not found while "family" is.

SELECT *
FROM bucket
RIGHT JOIN test1 ON test1.bucket_id = bucket.id
WHERE
  bucket.isvisible > 0 AND
  MATCH(bucket.bkt_keywords, bucket.bkt_desc, bucket.bkt_name)
    AGAINST('family' IN BOOLEAN MODE) OR
  MATCH(test1.test1_name, test1.test1_desc) 
    AGAINST('family' IN BOOLEAN MODE)

I should also add that all text fields have the collation of utf8_general_ci as does the entire table which is MyISAM.

like image 717
ppetree Avatar asked Feb 04 '13 20:02

ppetree


2 Answers

I think your tables do not use utf8_general_ci as collation, but utf8_bin. I was able to reproduce the behaviour you describe after modifying the tables as follows:

ALTER TABLE test1 CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;
ALTER TABLE bucket CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;

You should perhaps set your tables' collation explicitely to:

ALTER TABLE test1 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE bucket CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

If the above changes anything, I would guess your server or session is actually set to use another collation by default (since the collation is not specified in your tables definition). This could be checked with:

SHOW GLOBAL VARIABLES LIKE 'collation_server';
SHOW SESSION VARIABLES LIKE 'collation_server';
like image 102
RandomSeed Avatar answered Oct 30 '22 12:10

RandomSeed


The answer is apparently adding some parens around the two match against clauses.

SELECT *
FROM bucket
RIGHT JOIN test1 ON test1.bucket_id = bucket.id
WHERE bucket.isvisible > 0 AND
 ( MATCH(bucket.bkt_keywords, bucket.bkt_desc, bucket.bkt_name)
  AGAINST('family' IN BOOLEAN MODE) OR
  MATCH(test1.test1_name, test1.test1_desc) 
  AGAINST('family' IN BOOLEAN MODE) )
like image 2
ppetree Avatar answered Oct 30 '22 11:10

ppetree