Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL throws Incorrect string value error

I'm trying to store the following tweet into a longtext column / utf8 charset / MySQL 5.5. database with MyISAM storage on.

We also tried utf8mb4, utf16, utf32 charsets but are unable to get past this issue.

tweet="@Dorable_Dimples: Okay enough of those #IfYouWereMines I'm getting dep ressed. #foreveralone ?" lol yes  mysql> ALTER DATABASE foo CHARACTER SET utf8 COLLATE utf8_bin;  mysql> show variables like 'char%'; +--------------------------+-------------------------------------------+ | Variable_name | Value | +--------------------------+-------------------------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /rdsdbbin/mysql-5.5.12.R1/share/charsets/ |  Incorrect string value: '\xF0\x9F\x98\x94\xE2\x80...' for column 'tweet' at row 1  Unable to store tweet "@Dorable_Dimples: Okay enough of those #IfYouWereM ines I'm getting depressed. #foreveralone ?" lol yes javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCExcept ion: could not insert at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityMana gerImpl.java:1387) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityMana gerImpl.java:1315) at org.hibernate.ejb.AbstractEntityManagerImpl.convert(AbstractEntityMana gerImpl.java:1321) at org.hibernate.ejb.AbstractEntityManagerImpl.persist(AbstractEntityMana gerImpl.java:843) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462)  at org.hibernate.exception.SQLStateConverter.handledNonSpecificException( SQLStateConverter.java:140) at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.ja va:128) at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelpe r.java:66) at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(Abstra ctReturningDelegate.java:64) at org.hibernate.persister.entity.AbstractEntityPersister.insert(Abstract EntityPersister.java:2345) at org.hibernate.persister.entity.AbstractEntityPersister.insert(Abstract EntityPersister.java:2852) at org.hibernate.action.EntityIdentityInsertAction.execute(EntityIdentity InsertAction.java:71) at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:273) at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplica te(AbstractSaveEventListener.java:320) at org.hibernate.event.def.AbstractSaveEventListener.performSave(Abstract SaveEventListener.java:203) at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId( AbstractSaveEventListener.java:129) at org.hibernate.ejb.event.EJB3PersistEventListener.saveWithGeneratedId(E JB3PersistEventListener.java:69) at org.hibernate.event.def.DefaultPersistEventListener.entityIsTransient( DefaultPersistEventListener.java:179) at org.hibernate.event.def.DefaultPersistEventListener.onPersist(DefaultP ersistEventListener.java:135) at org.hibernate.event.def.DefaultPersistEventListener.onPersist(DefaultP ersistEventListener.java:61) at org.hibernate.impl.SessionImpl.firePersist(SessionImpl.java:808) at org.hibernate.impl.SessionImpl.persist(SessionImpl.java:782) at org.hibernate.impl.SessionImpl.persist(SessionImpl.java:786) at org.hibernate.ejb.AbstractEntityManagerImpl.persist(AbstractEntityMana gerImpl.java:837) ... 5 more Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x98\x94\xE2\x 80...' for column 'tweet' at row 1 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3609) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3541) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2002) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2624) at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.jav a:2127) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java: 2427) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java: 2345) at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java: 2330) at org.hibernate.id.IdentityGenerator$GetGeneratedKeysDelegate.executeAnd Extract(IdentityGenerator.java:94) at org.hibernate.id.insert.AbstractReturningDelegate.performInsert(Abstra ctReturningDelegate.java:57) 
like image 256
priya Avatar asked Jan 03 '12 09:01

priya


People also ask

What does incorrect string value mean?

That error means that either you have the string with incorrect encoding (e.g. you're trying to enter ISO-8859-1 encoded string into UTF-8 encoded column), or the column does not support the data you're trying to enter.

What is the difference between utf8 and utf8mb4?

The difference between utf8 and utf8mb4 is that the former can only store 3 byte characters, while the latter can store 4 byte characters. In Unicode terms, utf8 can only store characters in the Basic Multilingual Plane, while utf8mb4 can store any Unicode character.


2 Answers

It's the character at the end of the tweet that's causing the problem.

It looks like an 'emoji' character aka japanese smiley face but it's not displaying for me in either Chrome or Safari.

There are known issues storing 4byte utf characters in some versions of MySQL. Apparently you must use utf8mb4 to represent 4 byte UTF characters, as the normal utf8 character set can only represent characters up to 3 bytes in length and so can't store character which are outside of the Basic Multilingual Plane

http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-utf8mb4.html

Which is news to me as it basically means that the utf8 datatype in MySQL isn't really proper utf8.

There are suggestions of how to handle this here How to insert utf-8 mb4 character(emoji in ios5) in mysql? including:

"Also make sure your app layer sets its database connections' character set to utf8mb4. Double-check this is actually happening – if you're running an older version of your chosen framework's mysql client library, it may not have been compiled with utf8mb4 support and it won't set the charset properly. If not, you may have to update it or compile it yourself"

If you're using Connector/J you need to set character_set_server=utf8mb4 in the connection config.

All your character sets should be utf8mb4, which you may have tried but aren't currently set.

like image 104
Danack Avatar answered Sep 21 '22 19:09

Danack


I like Danask57's answer - it's correct and the 'right' way to do it. (I up voted it myself)

However, another quick-and-dirty solution is to change the schema. use a varbinary or binary to store the tweet string:

http://dev.mysql.com/doc/refman/5.0/en/binary-varbinary.html

The upside is that you won't get any character set problems.

The downside is that your string comparison and sorting will be lost, and you won't be able to full text index the column.

Just a suggestion, but this is not the 'right' answer, just a quick and dirty solution that gets things working.

like image 39
FlipMcF Avatar answered Sep 19 '22 19:09

FlipMcF