For some strange reason I can't seem to add UTF-8 data to my MySQL database. When I enter a non-Latin character, it's stored as a question mark. Everything else is stored fine. So for example, "this is an example®™" is stored fine, but "和英辞典" is stored as "????".
The connection URL is fine:
private DataSource getDB() throws PropertyVetoException {
    ComboPooledDataSource db = new ComboPooledDataSource();
    db.setDriverClass("com.mysql.jdbc.Driver");
    db.setJdbcUrl("jdbc:mysql://domain.com:3306/db?useUnicode=true&characterEncoding=UTF-8");
    db.setUser("...");
    db.setPassword("...");
    return db;
}
I'm using PreparedStatement, as you would expect. I even tried executing "set names utf8", as someone suggested.
Connection conn = null;
PreparedStatement stmt = null;
ResultSet rs = null;
try {
    conn = db.getConnection();
    stmt = conn.prepareStatement("set names utf8");
    stmt.execute();
    stmt = conn.prepareStatement("set character set utf8");
    stmt.execute();
    ... set title...
    stmt = conn.prepareStatement("INSERT INTO Table (title) VALUES (?)");
    stmt.setString(1, title);
    stmt.execute();
} catch (final SQLException e) {
    ...
The table itself seems to be fine.
Default Character Set: utf8
Default Collation: utf8_general_ci
...
Field title:
Type text
Character Set: utf8
Collation: utf8_unicode_ci
I tested it by entering Unicode text ("和英辞典" specifically) through a GUI editor and then selecting from the table, and it was returned just fine. So this seems to be an issue with JDBC.
What am I missing?
Here's the command to change the character set of a MySQL table from latin1 to utf8. Replace table_name with the name of your table.
mysql> ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
To convert to utf8mb4 instead, use CHARACTER SET utf8mb4 with a utf8mb4 collation such as utf8mb4_unicode_ci.
The difference between MySQL's utf8 and utf8mb4 is that the former stores at most 3 bytes per character, while the latter stores up to 4 bytes per character. In Unicode terms, utf8 can only store characters in the Basic Multilingual Plane, while utf8mb4 can store any Unicode character.
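To make the Basic Multilingual Plane distinction concrete, here is a minimal Java sketch that reports whether a string contains any character outside the BMP, which is the kind of text that would need utf8mb4 rather than utf8. The class name Utf8mb4Check and the method needsUtf8mb4 are made up for illustration, and the string literals use Unicode escapes so that the source file's encoding does not matter.

```java
final class Utf8mb4Check {
    // Hypothetical helper: true if the text contains a supplementary
    // code point (above U+FFFF), which takes 4 bytes in UTF-8.
    static boolean needsUtf8mb4(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        // The four CJK characters from the question; all are in the BMP.
        System.out.println(needsUtf8mb4("\u548C\u82F1\u8F9E\u5178")); // false
        // U+1F600 (an emoji), written as a surrogate pair; outside the BMP.
        System.out.println(needsUtf8mb4("\uD83D\uDE00"));             // true
    }
}
```

Note that the text from the question fits in 3-byte utf8, so the utf8 versus utf8mb4 difference on its own would not explain the question marks for that particular string.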
UTF-8 encodes a character into a sequence of one, two, three, or four bytes. UTF-16 encodes a Unicode character into either two or four bytes. This distinction is evident from their names: the number is the size of the code unit in bits. In UTF-8, the smallest representation of a character is one byte, or eight bits; in UTF-16, it is two bytes, or sixteen bits.
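The byte counts described above can be checked with a short Java sketch. The class and method names are illustrative only. UTF_16BE is used instead of UTF_16 because the latter prepends a two-byte byte order mark when encoding, which would inflate the counts.

```java
import java.nio.charset.StandardCharsets;

final class EncodedLengths {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as UTF-16 (big-endian, no byte order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // U+0041, ASCII
            "\u00AE",       // U+00AE, registered sign
            "\u548C",       // U+548C, first character of the question's example
            "\uD83D\uDE00"  // U+1F600, outside the BMP (surrogate pair)
        };
        for (String s : samples) {
            System.out.println(utf8Bytes(s) + " byte(s) in UTF-8, "
                    + utf16Bytes(s) + " byte(s) in UTF-16");
        }
        // Prints 1 and 2, then 2 and 2, then 3 and 2, then 4 and 4.
    }
}
```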
In your JDBC connection string, you just need to set the character encoding like this:
jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8
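As a rough sketch, the URL can be assembled in one place so the encoding parameters are not forgotten on any connection. The helper name mysqlUtf8Url and the host, port, and database values are placeholders; the useUnicode and characterEncoding parameters are the ones already shown in the question's own URL.

```java
final class JdbcUrls {
    // Hypothetical helper: builds a MySQL JDBC URL that asks the driver
    // to exchange text with the server as UTF-8.
    static String mysqlUtf8Url(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own server and schema.
        System.out.println(mysqlUtf8Url("localhost", 3306, "dbname"));
        // jdbc:mysql://localhost:3306/dbname?useUnicode=true&characterEncoding=UTF-8
    }
}
```

The resulting string would then be passed to something like db.setJdbcUrl(...) from the question's getDB() method.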