I'm developing a fairly straightforward web app using Flask and MySQL.
I'm struggling with Unicode. Users sometimes paste text that they copied from Word, and the app falls over on the old smart quotes such as u'\u201c'.
A little bit of investigation shows that the connection I have to MySQL is using the latin1 charset (which seems to be the default).
How can I tell the connection to use Unicode?
I'm using PyMySQL, which purports to be a drop-in replacement for MySQLdb. MySQLdb defines a set_character_set(self, charset) function for connection objects, but PyMySQL doesn't (I get an error if I try).
The MySQL server has a compiled-in default character set and collation. To change these defaults, use the --character-set-server and --collation-server options when you start the server.
connect() supports the following arguments: host, user, password, database, port, unix_socket, client_flags, ssl_ca, ssl_cert, ssl_key, ssl_verify_cert, compress.
From MySQL 8.0, utf8mb4 is the default character set, and the default collation for utf8mb4 is utf8mb4_0900_ai_ci.
PyMySQL is an interface for connecting to a MySQL database server from Python. It implements the Python Database API v2.0 and contains a pure-Python MySQL client library. The goal of PyMySQL is to be a drop-in replacement for MySQLdb.
I worked it out by poking around in the PyMySQL source (I had tried before, but couldn't find the right place!).
You can specify it when you create the connection:
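import pymysql

# charset sets the connection character set; use 'utf8mb4' if you also need
# 4-byte characters such as emoji, since MySQL's 'utf8' is limited to 3 bytes.
conn = pymysql.connect(host='localhost',
                       user='username',
                       passwd='password',
                       db='database',
                       charset='utf8')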
Solved my problem.
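For anyone wondering why the charset matters here, this is a rough sketch of what I think was going on. Python's own latin-1 codec cannot represent the Word smart quotes, while UTF-8 can. The connection_kwargs helper is hypothetical (my own name, not part of PyMySQL), and the password/database keyword spellings assume a reasonably recent PyMySQL release.

```python
# Illustration only: a Word-style smart-quoted string.
text = u'\u201cquoted\u201d'

# Python's latin-1 codec (ISO-8859-1) has no mapping for U+201C / U+201D.
try:
    text.encode('latin-1')
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# UTF-8 encodes each smart quote as three bytes.
utf8_bytes = text.encode('utf-8')


def connection_kwargs(charset='utf8mb4'):
    """Hypothetical helper: keep the connect() arguments in one place.

    The host, user, password and database values are placeholders.
    """
    return dict(host='localhost',
                user='username',
                password='password',
                database='database',
                charset=charset)


# With PyMySQL installed and a server running, this would be used as:
#   import pymysql
#   conn = pymysql.connect(**connection_kwargs())
```

Defaulting the helper to utf8mb4 is my own choice, so that 4-byte characters are covered as well. Plain 'utf8' was enough for the smart quotes in my case.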