I'm storing single emojis in a CHAR column in a MySQL database. The column's encoding is utf8mb4.
When I run this aggregate query, MySQL won't group by the emoji characters. It instead returns a single row with a single emoji and the count of all the rows in the database.
SELECT emoji, count(emoji) FROM emoji_counts GROUP BY emoji
Here's my table definition:
CREATE TABLE `emoji_counts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`emoji` char(1) DEFAULT '',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Is there some special Unicode behavior I'll have to account for?
Turns out I needed to specify an expanded collation in the query, namely utf8mb4_unicode_520_ci.
This worked:
SELECT emoji, count(emoji) FROM emoji_counts group by emoji collate utf8mb4_unicode_520_ci;
EDIT: That collation isn't available on some server configs (including ClearDB's)... utf8mb4_bin also appears to work.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With