
pyMySQL set connection character set

I'm developing a fairly straightforward web app using Flask and MySQL.

I'm struggling with Unicode. Users sometimes paste text copied from Word, and the app falls over on the smart quotes, e.g. u'\u201c'.

A little investigation shows that my connection to MySQL is using the latin1 charset (which seems to be the default).
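To illustrate the failure without a database, here is a minimal sketch in plain Python. It assumes the driver encodes outgoing text with Python's `latin-1` codec, which is what the error suggests; the helper name `can_encode` is made up for this example.

```python
# Sketch: why a Latin-1 connection cannot carry Word's smart quotes.
# Assumption: the driver encodes query text with Python's "latin-1" codec.
smart_quote = "\u201c"  # LEFT DOUBLE QUOTATION MARK, as pasted from Word


def can_encode(text, codec):
    """Return True if `text` can be represented in `codec` (hypothetical helper)."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


latin1_ok = can_encode(smart_quote, "latin-1")  # not representable in Latin-1
utf8_ok = can_encode(smart_quote, "utf-8")      # representable in UTF-8
utf8_bytes = smart_quote.encode("utf-8")        # three bytes on the wire

print(latin1_ok, utf8_ok, utf8_bytes)
```

So the character itself is fine; it is the connection's encoding that has no slot for it.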

How can I tell it to use a Unicode character set for the connection?

I'm using pyMySQL, which purports to be a drop-in replacement for MySQLdb. MySQLdb defines a set_character_set(self, charset) function for connection objects, but pyMySQL doesn't (I get an error if I try).

MalphasWats asked Jun 18 '12 12:06




1 Answer

I worked it out by poking around in the pyMySQL source (I had tried, but couldn't find the right place!).

You can specify it when you create the connection:

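import pymysql

# charset sets the character set for this connection (it was latin1 by default for me)
conn = pymysql.connect(host='localhost',
                       user='username',
                       passwd='password',
                       db='database',
                       charset='utf8')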

Solved my problem.
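For anyone who wants to confirm that the setting took effect, here is a rough sketch rather than a definitive recipe. It assumes a reasonably recent PyMySQL that accepts the `password` and `database` keyword names, and it swaps in `utf8mb4` instead of the `utf8` used above, since MySQL's `utf8` stores at most three bytes per character and `utf8mb4` also covers four-byte characters such as emoji. The helper names `connection_settings`, `connection_charset` and `main` are made up for this example, and the credentials are placeholders.

```python
def connection_settings(charset="utf8mb4"):
    """Build keyword arguments for pymysql.connect() with an explicit charset."""
    return {
        "host": "localhost",     # placeholder values, as in the snippet above
        "user": "username",
        "password": "password",
        "database": "database",
        "charset": charset,      # assumption: utf8mb4 instead of utf8
    }


def connection_charset(conn):
    """Ask the server which character set this connection is using."""
    with conn.cursor() as cur:
        cur.execute("SHOW VARIABLES LIKE 'character_set_connection'")
        row = cur.fetchone()     # (variable_name, value) with the default cursor
    return row[1]


def main():
    import pymysql  # imported here so the helpers above load without the driver

    conn = pymysql.connect(**connection_settings())
    try:
        print(connection_charset(conn))
    finally:
        conn.close()
```

Calling `main()` against a running server should print the charset that was requested; if it still reports latin1, the `charset` argument is not reaching the connection.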

MalphasWats answered Sep 18 '22 15:09
