In a MySQL database, a table is encoded in utf8, but for some reason the connection is in latin1.
res <- RMySQL::dbSendQuery(con,"show variables like 'character_set_%'")
dbFetch(res)
Variable_name Value
1 character_set_client latin1
2 character_set_connection latin1
3 character_set_database utf8mb4
4 character_set_filesystem binary
5 character_set_results latin1
6 character_set_server latin1
7 character_set_system utf8
8 character_sets_dir /usr/share/mysql/charsets/
This page explain how to set the connection's character set to utf8 using RMySQL.
RMySQL::dbGetQuery(con,"show variables like 'character_set_%'")
RMySQL::dbGetQuery(con,"set names utf8")
But I actually prefer to use the dplyr::tbl
to query the database. Since the connection created by dplyr::src_mysql
only has the possibility to send sql statements that create tables. What is the dplyr way to set the connection setting to use utf8 encoding?
I ran into the same problem, which I solved as follows:
foo_db <- src_mysql(host='0.0.0.0',user='dbuser',password='a_password',
dbname='FlightTimes',port=3336)
dbGetQuery(foo_db$con,'SET NAMES utf8')
I found this was possible by looking at the structure of foo_db
via str(foo_db)
, seeing there was a con
attribute of class MySQLConnection
, then applying your dbGetQuery
mantra.
Edit the server option file (located in /etc/mysql/my.cnf on a Debian system) and add the following options:
collation-server = utf8_unicode_ci
character-set-server = utf8
skip-character-set-client-handshake
The mysql server configuration file can also be edited with mysql-workbench.
After this change, dplyr::tbl
fetches character vector encoded in utf-8.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With