Regarding MySQL database and Java JDBC connection encoding.
Database has been converted to utf8mb4 and utf8mb4_unicode_ci as shown this is the results of SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; from the JDBC connection.
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
From the MySQL Workbench and the terminal connecting directly to the DB I can see the Unicode character í with its correct hex value of c3 ad
+------------------------------+
| HEX(location.name) |
+------------------------------+
| C3AD |
+------------------------------+
JDBC Connection settings:
useUnicode=true&characterEncoding=UTF-8
Using HikariCP with the config of
config.addDataSourceProperty("useUnicode", "true");
config.addDataSourceProperty("characterEncoding", "utf-8");
config.setConnectionInitSql("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci");
Using mysql-connector-java:8.0.11
From the results of querying the table in question using the JDBC connection the í character is returned as � within Postman. And the 💩 is returned as ? within Postman.
Which leads me to believe that my connection during reading is not UTF-8 according to stackoverflow.com/questions/38363566 how would I detect this?
Database and application have been reset to apply settings, if it was needed.
characterEncoding=utf-8 is incompatible with utf8mb4. Use character_set_server=utf8mb4 in your JDBC URL instead, or config.addDataSourceProperty("character_set_server", "utf8mb4");. Do not use characterEncoding at all.
From the MySQL Connection/J developer guide → Using Character Sets → Setting the Character Encoding:
… to use the 4-byte UTF-8 character set with Connector/J, configure the MySQL server with
character_set_server=utf8mb4, and leavecharacterEncodingout of the Connector/J connection string.
And right below that:
Warning
In order to use the
utf8mb4character set for the connection, the server MUST be configured withcharacter_set_server=utf8mb4; if that is not the case, whenUTF-8is used forcharacterEncodingin the connection string, it will map to the MySQL character set nameutf8, which is an alias forutf8mb3.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With