I'm writing Python 2 code with Unicode strings, importing unicode_literals and I'm having issues with raising exceptions.
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
raise Exception('Tést')
When doing this, the 'Tést' string is stripped off the terminal.
I can workaround this with
raise Exception('Tést'.encode('utf-8'))
I'd rather find a global solution than having to do this in all raise Exception
statements.
(Since I'm using PyQt's tr()
function in Exception messages, special characters must be handled, I can't know at coding time whether encode('utf-8')
is necessary.)
Worse. Sometimes, I want to catch an Exception, get its message, and raise a new Exception, concatenating a base string with the first Exception string.
I have to do it this way:
try:
raise TypeError('Tést'.encode('utf-8'))
except Exception as e:
raise Exception('Exception: {}'.format(str(e).decode('utf-8')).encode('utf-8'))
but I really wish it could be less cumbersome (and this example doesn't even include the self.tr()
calls).
Is there any simpler way ?
(And as a side question, are things simpler with Python3 ? Can Exception use unicode strings ?)
The first step toward solving your Unicode problem is to stop thinking of type< 'str'> as storing strings (that is, sequences of human-readable characters, a.k.a. text). Instead, start thinking of type< 'str'> as a container for bytes.
To allow working with Unicode characters, Python 2 has a unicode type which is a collection of Unicode code points (like Python 3's str type). The line ustring = u'A unicode \u018e string \xf1' creates a Unicode string with 20 characters.
In python, to remove Unicode character from string python we need to encode the string by using str. encode() for removing the Unicode characters from the string.
Unicode is a standard encoding system that is used to represent characters from almost all languages. Every Unicode character is encoded using a unique integer code point between 0 and 0x10FFFF . A Unicode string is a sequence of zero or more code points.
Thanks to the comments below the question, I came up with this.
The idea is to use a custom Exception subclass.
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
class MyException(Exception):
def __init__(self, message):
if isinstance(message, unicode):
super(MyException, self).__init__(message.encode('utf-8'))
self.message = message
elif isinstance(message, str):
super(MyException, self).__init__(message)
self.message = message.decode('utf-8')
# This shouldn't happen...
else:
raise TypeError
def __unicode__(self):
return self.message
class MySubException(MyException):
pass
try:
raise MyException('Tést')
except MyException as e:
print(e.message)
raise MySubException('SubException: {}'.format(e))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With