
ConfigParser with Unicode items


My troubles with ConfigParser continue. It doesn't seem to support Unicode very well. The config file is indeed saved as UTF-8, but when ConfigParser reads it, the contents seem to be encoded into something else. I assumed it was latin-1 and thought overriding optionxform could help:

-- configfile.cfg --

[rules]
Häjsan = 3
☃ = my snowman

-- myapp.py --

# -*- coding: utf-8 -*-

import ConfigParser

def _optionxform(s):
    try:
        newstr = s.decode('latin-1')
        newstr = newstr.encode('utf-8')
        return newstr
    except Exception, e:
        print e

cfg = ConfigParser.ConfigParser()
cfg.optionxform = _optionxform

cfg.read("myconfig")

Of course, when I read the config I get:

'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128) 

I've tried a couple of different variations of decoding 's', but the point seems moot, since it really should be a unicode object from the beginning. After all, the config file is UTF-8. I have confirmed that something is wrong in the way ConfigParser reads the file by stubbing it out with this DummyConfig class. If I use that, then everything is nice unicode, fine and dandy.

-- config.py --

# -*- coding: utf-8 -*-

apa = {'rules': [(u'Häjsan', 3), (u'☃', u'my snowman')]}

class DummyConfig(object):
    def sections(self):
        return apa.keys()

    def items(self, section):
        return apa[section]

    def add_section(self, apa):
        pass

    def set(self, *args):
        pass

Any ideas about what could be causing this, or suggestions of other config modules that support Unicode better, are most welcome. I don't want to use sys.setdefaultencoding()!

asked Oct 30 '09 07:10 by pojo




2 Answers

The ConfigParser.readfp() method can take a file object. Have you tried opening the file with the correct encoding using the codecs module before passing it to ConfigParser, like below?

import codecs
cfg.readfp(codecs.open("myconfig", "r", "utf8"))

For Python 3.2 or above, readfp() is deprecated. Use read_file() instead.
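For Python 3, the same idea might look like the sketch below. It is only an illustration: the helper name load_utf8_config and the temporary file are assumptions made for the demo, not part of the question's code. Setting optionxform to str is optional and shown only because ConfigParser lowercases option names by default.

```python
import configparser
import os
import tempfile

# Same contents as the question's config file.
CONFIG_TEXT = "[rules]\nHäjsan = 3\n☃ = my snowman\n"


def load_utf8_config(path):
    """Hypothetical helper: parse `path` as UTF-8 using read_file()."""
    cfg = configparser.ConfigParser()
    # Keep option names exactly as written instead of lowercasing them.
    cfg.optionxform = str
    # Open the file with an explicit encoding and hand the file object over.
    with open(path, encoding="utf-8") as fh:
        cfg.read_file(fh)
    return cfg


# Demo: write the sample config to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myconfig")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(CONFIG_TEXT)
    cfg = load_utf8_config(path)
    items = cfg.items("rules")
```

Note that values always come back as strings, so the 3 in the config file is read as "3" and has to be converted by the caller (for example with cfg.getint()).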

answered Sep 28 '22 10:09 by Tendayi Mawushe


Python 3.2 introduced an encoding parameter for read(), so it can now be used as:

cfg.read("myconfig", encoding='utf-8') 
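One thing to watch with read() is that it silently skips files it cannot open and returns the list of files it actually parsed. A minimal Python 3 sketch that checks this return value follows; the helper name read_utf8_config and the temporary demo file are assumptions for illustration only.

```python
import configparser
import os
import tempfile


def read_utf8_config(path):
    """Hypothetical helper: parse `path` as UTF-8, failing loudly if unreadable."""
    cfg = configparser.ConfigParser()
    # read() returns the list of files it parsed successfully and
    # silently skips files that cannot be opened.
    loaded = cfg.read(path, encoding="utf-8")
    if not loaded:
        raise FileNotFoundError(path)
    return cfg


# Demo: write the question's sample config to a temporary file and read it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "myconfig")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write("[rules]\nHäjsan = 3\n☃ = my snowman\n")
    cfg = read_utf8_config(demo_path)
    snowman = cfg.get("rules", "☃")
    # Option names are lowercased by the default optionxform.
    option_names = cfg.options("rules")
```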
answered Sep 28 '22 08:09 by Krzysztof Słowiński