I am writing a test for a database which has Swedish characters in it. In the test, I directly use characters with umlauts and other such Swedish letters, and it runs just fine, reading filenames in from a database and doing string compares successfully.
However, upon importing this file to do pydoc generation, I get the all-too-familiar exception:
SyntaxError: Non-ASCII character '\xc3' in file foo.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
Upon doing some investigation on my own, I found that adding
# -*- coding: iso-8859-15 -*-
to the top of my file fixed the importing problem. However, now the test fails all of the string comparisons. I tried the alternate method of forgoing the coding declaration and writing the strings as
u"Bokmärken"
... but this still doesn't keep the test from failing.
Does anyone know a good way to fix this?
You need to set your encoding in your editor and the database so that they match. If your database is utf-8 encoded, and not iso-8859-15, then setting your editor to utf-8 should fix it. However, since your u'string' comparisons fail, this might not be the case.
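To see why a mismatch breaks comparisons, here is a minimal sketch (not taken from the original test) that encodes the same word both ways. The '\xc3' byte in the SyntaxError above hints that the file was saved as UTF-8, since that is the lead byte UTF-8 uses for ä, but that is an inference, not something the error confirms. The variable names are made up for illustration, and the string is written with an escape so the snippet itself needs no coding declaration.

```python
# The same text produces different bytes under different encodings.
text = u"Bokm\u00e4rken"  # "Bokmärken", spelled with an escape so this file stays pure ASCII

utf8_bytes = text.encode("utf-8")          # a-umlaut becomes two bytes: 0xC3 0xA4
latin9_bytes = text.encode("iso-8859-15")  # a-umlaut becomes one byte: 0xE4

# Reading UTF-8 bytes with the wrong codec gives a different string,
# which is why comparisons against the intended text fail.
misread = utf8_bytes.decode("iso-8859-15")

print(repr(utf8_bytes))
print(repr(latin9_bytes))
print(misread == text)                      # False
print(utf8_bytes.decode("utf-8") == text)   # True
```

If the file really is saved as UTF-8, declaring iso-8859-15 would make Python read each two-byte character as two separate characters, which matches the failing comparisons described in the question.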
Replace
# -*- coding: iso-8859-15 -*-
with
# -*- coding: utf-8 -*-
or (the equivalent)
# coding=utf-8
to try UTF-8 encoding.
Printing debugging output with repr('swedish string')
and repr(u'swedish string')
will also be useful in inspecting differences.
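As a rough sketch of that kind of inspection, the snippet below prints the type and repr of a value that stands in for what the database returns, next to the literal the test expects. The helper name `describe` and the sample values are hypothetical; in the real test you would pass in the actual value read from the database. The exact repr text differs between Python 2 and Python 3, so no output is shown here.

```python
def describe(value):
    """Return the type name and repr of a value so hidden byte differences show up."""
    return "%s %r" % (type(value).__name__, value)


# Hypothetical stand-in for a raw byte string returned by a database driver.
from_db = u"Bokm\u00e4rken".encode("utf-8")
# The text literal the test compares against.
expected = u"Bokm\u00e4rken"

print(describe(from_db))                   # byte string: the escapes \xc3\xa4 are visible
print(describe(expected))                  # text string
print(describe(from_db.decode("utf-8")))   # decoded bytes, now comparable to the literal
```

If the two reprs show different types or different escape sequences for the same word, the comparison is failing on encoding rather than on content.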
The coding declaration goes right after your interpreter line; it has to be on the first or second line of the file. Can you tell us what encoding your database is set to? Additionally, was the database data written by Python or inserted directly? You could have written data in the wrong encoding to the database to begin with, which is now causing problems on comparison.
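Once the database encoding is known, one common approach is to decode everything to text at the boundary and compare only text strings. The following is a sketch under the assumption that the database hands back UTF-8 bytes; `to_text` is a made-up helper, and the `rows` list is a stand-in for real query results.

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical query results: one raw byte string, one already-decoded string.
rows = [u"Bokm\u00e4rken".encode("utf-8"), u"Bokm\u00e4rken"]
expected = u"Bokm\u00e4rken"

matches = [to_text(row) == expected for row in rows]
print(matches)  # [True, True]
```

If the data was stored in a different encoding than assumed, `to_text` will either raise UnicodeDecodeError or produce a non-matching string, which is itself a useful hint about what the database actually contains.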