Avoiding Python UnicodeDecodeError in Jinja's nl2br filter

I'm using Jinja2's nl2br filter, which looks like:

import re
from jinja2 import evalcontextfilter, Markup, escape

_paragraph_re = re.compile(r'(?:\r\n|\r|\n){2,}')

@evalcontextfilter
def nl2br(eval_ctx, value):
    result = u'\n\n'.join(u'<p>%s</p>' % p.replace('\n', '<br>\n')
                      for p in _paragraph_re.split(escape(value)))
    if eval_ctx.autoescape:
        result = Markup(result)
    return result

The problem is that the filter fails if "value" contains any non-ASCII characters (for example, "/mɒnˈtænə/"). I get this error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 889, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 879, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 876, in wsgi_app
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/app.py", line 695, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/mcrittenden/Dropbox/Code/dropdo/dropdo.py", line 105, in view
    return render_template(template, src = url, data = content)
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/templating.py", line 85, in render_template
    context, ctx.app)
  File "/usr/local/lib/python2.6/dist-packages/Flask-0.6.1-py2.6.egg/flask/templating.py", line 69, in _render
    rv = template.render(context)
  File "/usr/local/lib/python2.6/dist-packages/Jinja2-2.5.5-py2.6.egg/jinja2/environment.py", line 891, in render
    return self.environment.handle_exception(exc_info, True)
  File "/home/mcrittenden/Dropbox/Code/dropdo/templates/text.html", line 1, in top-level template code
    {% extends "layout.html" %}
  File "/home/mcrittenden/Dropbox/Code/dropdo/templates/layout.html", line 25, in top-level template code
    {% block content %}{% endblock %}
  File "/home/mcrittenden/Dropbox/Code/dropdo/templates/text.html", line 8, in block "content"
    {{ data|nl2br }}
  File "/home/mcrittenden/Dropbox/Code/dropdo/dropdo.py", line 26, in nl2br
    for p in _paragraph_re.split(escape(value)))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position 12: ordinal not in range(128)

What's the best way to prevent the error without removing the problem characters altogether?

asked Feb 25 '11 at 17:02 by Mike Crittenden


1 Answer

Use unicode literals everywhere.

See "Unicode in Python, Completely Demystified".
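As a rough illustration of what "unicode everywhere" can look like for this filter: the byte 0xc9 in the traceback is consistent with the first byte of UTF-8-encoded "ɒ" (0xC9 0x92), which suggests "value" reaches the filter as a byte string rather than as unicode. The sketch below decodes bytes to text before escaping and uses unicode literals throughout. It makes several assumptions: the data is UTF-8, the helper names to_text and nl2br_text are hypothetical, and the standard library's html.escape stands in for Jinja2's escape so the example runs on Python 3 without Jinja2 installed. In the original Python 2 setup, the same decode step would go in front of Jinja2's own escape.

```python
import re
from html import escape  # stdlib stand-in for jinja2's escape (assumption)

_paragraph_re = re.compile(r'(?:\r\n|\r|\n){2,}')


def to_text(value, encoding='utf-8'):
    """Return value as a text (unicode) string, decoding bytes if needed.

    The UTF-8 default is an assumption; use whatever encoding the data
    actually arrives in.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def nl2br_text(value):
    """Hypothetical plain-function version of the nl2br logic.

    Everything after to_text() operates on text only, so no implicit
    ASCII decoding can be triggered.
    """
    text = escape(to_text(value))
    return u'\n\n'.join(
        u'<p>%s</p>' % p.replace(u'\n', u'<br>\n')
        for p in _paragraph_re.split(text)
    )


# UTF-8 bytes for "/mɒnˈtænə/" followed by a second line
raw = u'/m\u0252n\u02c8t\xe6n\u0259/\nsecond line'.encode('utf-8')
rendered = nl2br_text(raw)
```

Decoding once at the boundary (where the content is read, or at the top of the filter) keeps the rest of the code free of mixed byte/unicode operations, and the non-ASCII characters are preserved rather than stripped.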

answered Oct 27 '22 at 10:10 by Ignacio Vazquez-Abrams