IPython Notebook: What is the default encoding?

I have created a package that uses UTF-8 encoding.

One of its functions returns a DataFrame with a column encoded in UTF-8.

When using IPython at the command line, I don't have any problems showing the content of this table. When using the Notebook, it crashes with the error 'utf8' codec can't decode byte 0xe7. I've attached a full traceback below.

What is the proper encoding to use with the Notebook?

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-13-92c0011919e7> in <module>()
      3 ver = verif.VerificacaoNA()
      4 comp, total = ver.executarCompRealFisica(DT_INI, DT_FIN)
----> 5 comp

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\core\displayhook.pyc in __call__(self, result)
    240             self.update_user_ns(result)
    241             self.log_output(format_dict)
--> 242             self.finish_displayhook()
    243
    244     def flush(self):

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\displayhook.pyc in finish_displayhook(self)
     59         sys.stdout.flush()
     60         sys.stderr.flush()
---> 61         self.session.send(self.pub_socket, self.msg, ident=self.topic)
     62         self.msg = None
     63

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in send(self, stream, msg_or_type, content, parent, ident, buffers, subheader, track, header)
    557
    558         buffers = [] if buffers is None else buffers
--> 559         to_send = self.serialize(msg, ident)
    560         flag = 0
    561         if buffers:

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in serialize(self, msg, ident)
    461             content = self.none
    462         elif isinstance(content, dict):
--> 463             content = self.pack(content)
    464         elif isinstance(content, bytes):
    465             # content is already packed, as in a relayed message

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in <lambda>(obj)
     76
     77 # ISO8601-ify datetime objects
---> 78 json_packer = lambda obj: jsonapi.dumps(obj, default=date_default)
     79 json_unpacker = lambda s: extract_dates(jsonapi.loads(s))
     80

c:\Python27-32\lib\site-packages\pyzmq-13.0.0-py2.7-win32.egg\zmq\utils\jsonapi.pyc in dumps(o, **kwargs)
     70         kwargs['separators'] = (',', ':')
     71
---> 72     return _squash_unicode(jsonmod.dumps(o, **kwargs))
     73
     74 def loads(s, **kwargs):

c:\Python27-32\lib\json\__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, **kw)
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237         separators=separators, encoding=encoding, default=default,
--> 238         **kw).encode(obj)
    239
    240

c:\Python27-32\lib\json\encoder.pyc in encode(self, o)
    199         # exceptions aren't as detailed.  The list call should be roughly
    200         # equivalent to the PySequence_Fast that ''.join() would do.
--> 201         chunks = self.iterencode(o, _one_shot=True)
    202         if not isinstance(chunks, (list, tuple)):
    203             chunks = list(chunks)

c:\Python27-32\lib\json\encoder.pyc in iterencode(self, o, _one_shot)
    262                 self.key_separator, self.item_separator, self.sort_keys,
    263                 self.skipkeys, _one_shot)
--> 264         return _iterencode(o, 0)
    265
    266 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 199: invalid continuation byte

asked Mar 14 '13 21:03

Adriano Almeida

1 Answer

I had the same problem recently, and indeed setting the default encoding to UTF-8 did the trick:

import sys
reload(sys)  # Python 2 only: restores sys.setdefaultencoding, which is removed at startup
sys.setdefaultencoding("utf-8")

Running sys.getdefaultencoding() returned 'ascii' in my environment (Python 2.7.3), which is the default in Python 2.

Also see this related question and Ian Bicking's blog post on the subject.
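For anyone who would rather not change the interpreter-wide default, here is a minimal sketch of a different approach: decoding the byte strings explicitly before they reach the JSON serializer. It assumes the column actually holds Latin-1 (or cp1252) bytes, where 'ç' is the single byte 0xe7; the question does not say this, so treat it as a guess. The sample value and the helper name to_text are made up for illustration, and the snippet is written for Python 3.

```python
import json

# Hypothetical sample: "Verificação" encoded as Latin-1, so 'ç' is the single byte 0xe7.
raw = b"Verifica\xe7\xe3o"


def to_text(value, encodings=("utf-8", "latin-1")):
    """Return text for a bytes value, trying each codec in order (hypothetical helper).

    latin-1 accepts every byte sequence, so it has to come last in the list.
    """
    if not isinstance(value, bytes):
        return value  # already text, or not a string at all
    for codec in encodings:
        try:
            return value.decode(codec)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode %r with any of %r" % (value, encodings))


# Decoding the Latin-1 bytes as UTF-8 fails for the same reason given in the traceback.
try:
    raw.decode("utf-8")
    failure = None
except UnicodeDecodeError as exc:
    failure = exc.reason

# A stand-in for the DataFrame column: a mix of byte strings and text.
column = [raw, u"already text", b"plain ascii"]
cleaned = [to_text(v) for v in column]

# Once every value is text, JSON serialization no longer has to guess an encoding.
payload = json.dumps(cleaned)
print(failure)
print(payload)
```

With a real DataFrame, the same helper could presumably be applied to the affected column before displaying it, but I have not tested that against the asker's data.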

answered Feb 18 '23 22:02

assaflavi
