Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mySQL: utf8 charset on index table and duplicate key error

Tags:

mysql

utf-8

I have expected a strange behavior of mySQL when using a varchar field encoded in utf8 as primary key. It fails with duplicate key error for strings which aren't equal in my development environment.

A short example:

SET NAMES 'utf8';

CREATE TABLE `test` (
  `id` varchar(5) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `test` (`id`) VALUES ('das'), ('daß');

Fails with error: Duplicate entry 'daß' for key 'PRIMARY'.

I am running mySQL 5.5.35 on ubuntu 13.10 with default configuration.

On another mySQL Server (version 5.0.95) the same queries did not fail. Is this because of mySQL version or is there a configuration option to set encoding of index tables?

I run into this issue while trying to import a mySQL dump from productive server into my development environment.

like image 713
jelhan Avatar asked Feb 23 '14 15:02

jelhan


1 Answers

You should use the collation utf8_unicode_ci when you use German characters, according to the discussion in this bug: Bug #39816 German collation under utf8_unicode_ci is incorrect.

Despite the title of that bug, I just tested this on 5.6.15, and your test case works, whereas the default unicode collation does not work:

CREATE TABLE `test` (
  `id` varchar(5) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 DEFAULT COLLATE=utf8_unicode_ci;

INSERT INTO `test` (`id`) VALUES ('das'), ('daß');

PS: I recommend that you use a development environment the same versions of all software as your production environment, or at least share the same major version. You're bound to run into other incompatibilities if you develop in 5.5 and then try to deploy to 5.0.

like image 110
Bill Karwin Avatar answered Sep 23 '22 21:09

Bill Karwin