I have a table, students, with 3 columns: id, name, and age.
I have a UNIQUE index Index_2 on columns name and age.
CREATE TABLE `bedrock`.`students` (
  `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(45) NOT NULL,
  `age` INTEGER UNSIGNED NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE INDEX `Index_2` USING BTREE (`name`, `age`)
) ENGINE = InnoDB;
I tried this insert option:
insert into students (id, name, age) values (1, 'Ane', 23);
which works ok. Then I tried this one (note Ané, with e acute):
insert into students (id, name, age) values (2, 'Ané', 23);
and I receive this error message:
"Duplicate entry 'Ané-23' for key 'Index_2'"
MySQL somehow does not make any distinction between "Ane" and "Ané". How can I resolve this, and why is it happening?
Charset for table students is "utf8" and collation is "utf8_general_ci".
ALTER TABLE `students` CHARACTER SET utf8 COLLATE utf8_general_ci;
Later edit1: @Crozin:
I've changed to use collation utf8_bin:
ALTER TABLE `students` CHARACTER SET utf8 COLLATE utf8_bin;
but I receive the same error.
But if I create the table from the start with charset utf8 and collation utf8_bin, like this:
CREATE TABLE `students2` (
  `id` INTEGER UNSIGNED AUTO_INCREMENT,
  `name` VARCHAR(45),
  `age` VARCHAR(45),
  PRIMARY KEY (`id`),
  UNIQUE INDEX `Index_2` USING BTREE (`name`, `age`)
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_bin;
both insert commands below work:
insert into students2 (id, name, age) values (1, 'Ane', 23); -- works ok
insert into students2 (id, name, age) values (2, 'Ané', 23); -- works ok
This seems to be very weird.
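A possible explanation, sketched here rather than confirmed against this exact server: ALTER TABLE ... CHARACTER SET ... COLLATE ... changes only the table's default, which applies to columns added later, so the existing name column keeps utf8_general_ci. Assuming the students table defined above already holds the row (1, 'Ane', 23), the column collation can be inspected and the existing columns converted like this:

```sql
-- Show the collation each column actually uses (see the Collation column).
SHOW FULL COLUMNS FROM students;

-- Convert the table default AND all existing character columns to utf8_bin.
-- The unique index on (name, age) then compares names with the new collation.
ALTER TABLE students CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;

-- Under a binary collation, 'Ane' and 'Ané' are different index keys.
INSERT INTO students (id, name, age) VALUES (2, 'Ané', 23);
```

This would also account for students2 behaving differently: its columns were created with utf8_bin from the start.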
Later edit 2:
I saw another answer here. I'm not sure whether the user deleted it or it got lost. I was just testing it:
The user wrote that he first created 3 tables with 3 different collations:
CREATE TABLE `utf8_bin` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(45) COLLATE utf8_bin NOT NULL,
  `age` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `Index_2` (`name`,`age`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE `utf8_unicode_ci` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(45) COLLATE utf8_unicode_ci NOT NULL,
  `age` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `Index_2` (`name`,`age`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `utf8_general_ci` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(45) COLLATE utf8_general_ci NOT NULL,
  `age` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `Index_2` (`name`,`age`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The user's results were:
Insert commands:
INSERT INTO utf8_bin VALUES (1, 'Ane', 23), (2, 'Ané', 23);
Query OK, 2 rows affected (0.02 sec)
Records: 2 Duplicates: 0 Warnings: 0
INSERT INTO utf8_unicode_ci VALUES (1, 'Ane', 23), (2, 'Ané', 23);
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
INSERT INTO utf8_general_ci VALUES (1, 'Ane', 23), (2, 'Ané', 23);
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
Here are my results:
INSERT INTO utf8_bin VALUES (1, 'Ane', 23), (2, 'Ané', 23); -- works ok
INSERT INTO utf8_unicode_ci VALUES (1, 'Ane', 23), (2, 'Ané', 23); -- Duplicate entry 'Ané-23' for key 'Index_2'
INSERT INTO utf8_general_ci VALUES (1, 'Ane', 23), (2, 'Ané', 23); -- Duplicate entry 'Ané-23' for key 'Index_2'
I'm not sure why these INSERT commands worked for him but don't work for me.
He also wrote that he tested this on MySQL on Linux. Could that have something to do with it? I don't think so, though.
utf8_general_ci is a legacy collation that does not support expansions, contractions, or ignorable characters. It can make only one-to-one comparisons between characters. MySQL implements utf8 language-specific collations if the ordering with utf8_unicode_ci does not work well for a language.
An index turns an unordered set of rows into an ordered structure that allows records to be retrieved faster. It holds an entry for each value that appears in the indexed columns, which helps make queries that search on those columns more efficient in MySQL.
In addition to enforcing the uniqueness of data values, a unique index can also be used to improve data retrieval performance during query processing. Non-unique indexes are not used to enforce constraints on the tables with which they are associated.
When you specify UNIQUE KEY, the column is indexed, so its performance does not differ from that of another indexed column of the same type (e.g. a PRIMARY KEY).
and collation is "utf8_general_ci".
And that's the answer. If you're using the utf8_general_ci collation (actually this applies to all utf_..._[ci|cs] collations), diacritics are ignored in comparisons, thus:
SELECT "e" = "é" AND "O" = "Ó" AND "ä" = "a"
results in 1. Indexes also use the collation.
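To see the effect on a single comparison, the collation can be forced with the COLLATE operator. This is a minimal sketch that assumes the client really sends UTF-8 and that the connection character set is utf8 (set here with SET NAMES); the column aliases are only illustrative.

```sql
-- Assumption: the client encodes its input as UTF-8.
SET NAMES utf8;

SELECT
  'e' = 'é' COLLATE utf8_general_ci AS equal_general_ci,  -- 1: the accent is ignored
  'e' = 'é' COLLATE utf8_bin        AS equal_bin;         -- 0: compared by binary character value
```

A unique index on a column behaves like the first comparison or the second, depending on which collation the column itself uses.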
If you want to distinguish between ą and a, then use the utf8_bin collation (keep in mind that it also distinguishes between uppercase and lowercase characters).
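If only the name column needs accent-sensitive comparison, the collation can be changed on that column alone rather than on the whole table. A sketch, assuming the students definition from the question:

```sql
-- Change only the name column to a binary collation; other columns are untouched.
ALTER TABLE students
  MODIFY `name` VARCHAR(45) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;

-- Comparisons on name are now case-sensitive too: this no longer matches 'Ane'.
SELECT * FROM students WHERE name = 'ane';
```

Restating the full column definition in MODIFY matters, because attributes that are left out (such as NOT NULL) are not carried over.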
By the way, name and age don't guarantee any uniqueness.