How to save unicode with SQLAlchemy?

I've encountered the following error:

File "/vagrant/env/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 435, in do_execute
            cursor.execute(statement, parameters)
        exceptions.UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 8410: ordinal not in range(128)

It happens when I'm trying to save an ORM object that has a Python unicode string assigned to it. As a result, the parameters dict has a unicode string as one of its values, and the error is raised when that value is coerced to the str type.

I've tried setting convert_unicode=True on both the engine and the columns, but without success.

So what is a good way to handle unicode in SQLAlchemy?

UPDATE

Here are some details about my setup:

Table:

                                    Table "public.documents"
   Column   |           Type           |                       Modifiers                        
------------+--------------------------+--------------------------------------------------------
 id         | integer                  | not null default nextval('documents_id_seq'::regclass)
 sha256     | text                     | not null
 url        | text                     | 
 source     | text                     | not null
 downloaded | timestamp with time zone | not null
 tags       | json                     | not null
Indexes:
    "documents_pkey" PRIMARY KEY, btree (id)
    "documents_sha256_key" UNIQUE CONSTRAINT, btree (sha256)

ORM model:

class Document(Base):
    __tablename__ = 'documents'

    id = Column(INTEGER, primary_key=True)
    sha256 = Column(TEXT(convert_unicode=True), nullable=False, unique=True)
    url = Column(TEXT(convert_unicode=True))
    source = Column(TEXT(convert_unicode=True), nullable=False)
    downloaded = Column(DateTime(timezone=True), nullable=False)
    tags = Column(JSON, nullable=False)

SQLAlchemy settings:

ENGINE = create_engine('postgresql://me:secret@localhost/my_db',
                       encoding='utf8', convert_unicode=True)
Session = sessionmaker(bind=ENGINE)

The code that produces the error just creates a session, instantiates a Document object, and saves it with a unicode string assigned to its source field.

UPDATE #2

Check this repo - it has automated Vagrant/Ansible setup, and it reproduces this bug.

Gill Bates asked Jul 17 '14 05:07



1 Answer

Your problem is here:

$ sudo grep client_encoding /etc/postgresql/9.3/main/postgresql.conf
client_encoding            = sql_ascii

That causes psycopg2 to default to ASCII:

>>> import psycopg2
>>> psycopg2.connect('dbname=dev_db user=dev').encoding
'SQLASCII'

... which effectively shuts off psycopg2's ability to handle Unicode.
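
To see why an ASCII client encoding produces exactly the error from the question, here is a small standard-library sketch. It assumes the driver encodes outgoing text with Python's ascii codec when the connection reports SQL_ASCII. It is written for Python 3 (the question uses Python 2.7) and does not touch psycopg2 or a database. The helper name try_encode is made up for illustration.

```python
# Illustration only: mimics encoding a query parameter with a given codec.
# try_encode is a hypothetical helper, not part of psycopg2 or SQLAlchemy.

def try_encode(text, codec):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return str(exc)


# Contains an en dash (U+2013), the same character named in the traceback.
value = u"2013\u20132014"

ascii_result = try_encode(value, "ascii")   # fails: en dash is outside ASCII
utf8_result = try_encode(value, "utf-8")    # succeeds: UTF-8 can represent it

print(ascii_result)
print(utf8_result)
```

The ascii call reports the same "ordinal not in range(128)" failure as the traceback, while the utf-8 call returns bytes, which is why switching the client encoding is what matters here.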

You can either fix this in postgresql.conf:

client_encoding = utf8

(and then sudo invoke-rc.d postgresql reload), or you can specify the encoding explicitly when you create the engine:

self._conn = create_engine(src, client_encoding='utf8')

I recommend the former, because the early nineties are long gone. : )
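
Whichever fix you pick, it is worth confirming that it took effect. The sketch below checks the encoding attribute shown in the psycopg2 session above. is_unicode_capable is a hypothetical helper, and stand-in objects are used in place of real connections so the snippet runs without a database. Dropping non-alphanumeric characters before comparing is an assumption made so that 'SQL_ASCII' and 'SQLASCII' are treated the same.

```python
from types import SimpleNamespace


def is_unicode_capable(conn):
    """Return False if the connection's client encoding is SQL_ASCII."""
    # Normalize the name so 'SQL_ASCII', 'sql_ascii' and 'SQLASCII' all match.
    name = "".join(ch for ch in conn.encoding if ch.isalnum()).upper()
    return name != "SQLASCII"


# Stand-ins for psycopg2 connections (assumption: only .encoding is needed).
# With a real database you would pass the result of psycopg2.connect(...).
broken = SimpleNamespace(encoding="SQLASCII")
fixed = SimpleNamespace(encoding="UTF8")

print(is_unicode_capable(broken))
print(is_unicode_capable(fixed))
```

A check like this could run once at application startup, so a misconfigured server fails fast instead of surfacing later as a UnicodeEncodeError on some unlucky row.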

Gunnlaugur Briem answered Sep 27 '22 21:09
