
SQLAlchemy Unicode Problems in Exceptions

I'm working on a Flask app with a Postgres/SQLAlchemy/Flask-Admin stack. However, in the Admin interface, any DB error that contains Unicode letters can't be reported, since unicode(exc) raises UnicodeDecodeError.

I traced the problem to sqlalchemy.exc:

class StatementError(SQLAlchemyError):
    ...
    def __unicode__(self):
        return self.__str__()

I can reproduce the problem with:

class A(Base):
    __tablename__="a"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    name2 = Column(String, nullable=False)

session = Session()
a = A(name=u"עברית")
session.add(a)

try:
    session.commit()
except Exception as e:
    print(repr(e))
    print("------------------")
    print(unicode(e))

Which returns:

ProgrammingError('(psycopg2.ProgrammingError) column "name" of relation "a" does not exist\nLINE 1: INSERT INTO a (name, name2) VALUES (\'\xd7\xa2\xd7\x91\xd7\xa8\xd7\x99\xd7\xaa\', NULL) RETURNING...\n                       ^\n',)
------------------
Traceback (most recent call last):
  File "test.py", line 27, in <module>
    print(unicode(e))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 118: ordinal not in range(128)

I currently work around it by replacing the relevant exceptions with my own classes that decode from UTF-8. However, this is a terrible hack, and I'm looking for a proper solution:

  • Is there a way to configure SQLAlchemy to automatically decode the received error messages?
  • Is there a way to configure Postgres to output messages in a Latin encoding (less favorable, but acceptable)?
  • Is there a way to make unicode try to decode as UTF-8 instead of ASCII/Latin?
  • Is there any way to resolve it at all?

(The problem is relevant only to Python 2. In Python 3 the code above works. I believe it's because the default encoding is UTF-8.)

tmrlvi asked Apr 23 '17 02:04


1 Answer

I actually think that patching SQLAlchemy from your application is a reasonably clean solution. Here's why:

  • You've identified something that is generally agreed to be a bug in SQLAlchemy.

  • You can write a patch that behaves the same in every situation SQLAlchemy currently handles correctly. That is, your patch will not break existing code.

  • The probability is very high that even if SQLAlchemy is fixed, your patch will be harmless.

  • Making this change in one place limits the impact of the SQLAlchemy bug across your code, unlike solutions that require changing every place where exceptions might be printed.

  • Changing Postgres to return latin1 encoding actually wouldn't help, because Python is using the ascii encoding, which would give the same error when given a latin1 string containing non-ASCII characters. Also, changing Postgres to return latin1 errors would probably involve changing the connection encoding; that likely creates issues for Unicode data.
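To make the last point concrete, here is a minimal sketch of why neither UTF-8 nor latin1 bytes survive an implicit ascii decode. The helper name decodes_as and the sample strings are assumptions made up for illustration; they are not real Postgres output.

```python
# -*- coding: utf-8 -*-
# Illustrative only: decodes_as and the sample strings are hypothetical.


def decodes_as(raw, encoding):
    """Return True if the byte string `raw` decodes cleanly with `encoding`."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# UTF-8 bytes for the Hebrew word from the question; the first byte is 0xd7,
# the same byte the traceback complains about.
utf8_message = u"עברית".encode("utf-8")
# A latin1-encoded message with one non-ASCII byte (0xe9).
latin1_message = u"café".encode("latin-1")

ascii_ok_utf8 = decodes_as(utf8_message, "ascii")      # ascii rejects bytes >= 0x80
ascii_ok_latin1 = decodes_as(latin1_message, "ascii")  # so latin1 output fails too
utf8_ok = decodes_as(utf8_message, "utf-8")            # an explicit codec works
```

Both ascii checks come back False, so switching the server messages to latin1 would only move the failing byte, while naming the codec explicitly succeeds.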

Here's a simple program that patches sqlalchemy.exc.StatementError and tests the patch. If you wanted, you could even try generating an exception that includes Unicode, converting it to unicode, and applying the patch only if that raises UnicodeDecodeError. That way, your patch would automatically stop being applied once SQLAlchemy fixes the issue.

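# -*- coding: utf-8 -*-
# Python 2 only: relies on the unicode builtin and the print statement.
from sqlalchemy.exc import StatementError

def statement_error_unicode(self):
    # str(self) is a byte string, assumed here to be UTF-8 encoded; decode it
    # explicitly instead of letting Python fall back to the ascii codec.
    return unicode(str(self), 'utf-8')

# See <link to sqlalchemy issue>; can be removed once we require a
# version of sqlalchemy with a fix to that issue
StatementError.__unicode__ = statement_error_unicode

# Test the patch with a message that contains a non-ASCII character.
message = u'Sqlalchemy unicode 😞'
message_str = message.encode('utf-8')
error = StatementError(message_str, 'select * from users', tuple(), '')
print unicode(error)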
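
The conditional patch suggested above could look roughly like this. It is only a sketch: StandInError is a hypothetical class that copies the __unicode__ from the question so the example runs without SQLAlchemy installed, and patch_if_needed, make_probe, and decode_utf8_unicode are names invented here. It also assumes the error text is UTF-8.

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2
# `unicode` exists only on Python 2; on Python 3 plain `str` is already text.
text_type = unicode if PY2 else str  # noqa: F821


class StandInError(Exception):
    """Hypothetical stand-in that copies the __unicode__ shown in the question."""

    def __unicode__(self):
        return self.__str__()


def decode_utf8_unicode(self):
    # Replacement __unicode__: decode the byte string explicitly as UTF-8.
    # Only ever installed (and called) on Python 2.
    return text_type(str(self), "utf-8")


def patch_if_needed(exc_class, make_probe, replacement):
    """Install `replacement` as __unicode__ only if the probe fails to convert."""
    try:
        text_type(make_probe())
    except UnicodeDecodeError:
        exc_class.__unicode__ = replacement
        return True
    return False


def make_probe():
    message = u"עברית"
    # Mimic the UTF-8 byte-string message from the question on Python 2;
    # on Python 3 exception messages are text already.
    return StandInError(message.encode("utf-8") if PY2 else message)


patched = patch_if_needed(StandInError, make_probe, decode_utf8_unicode)
```

On an interpreter where the conversion already works, patch_if_needed returns False and leaves the class untouched, which is the self-disabling behaviour described above. With the real library you would presumably pass StatementError and a factory that builds an instance the way the test program does; that combination is not exercised here.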
Sam Hartman answered Nov 13 '22 08:11