I'm trying to use UTF-8 characters when rendering a template with Jinja2. Here is what my template looks like:
<!DOCTYPE HTML>
<html manifest="" lang="en-US">
<head>
    <meta charset="UTF-8">
    <title>{{title}}</title>
    ...
The title variable is set something like this:
index_variables = {'title': ''}
index_variables['title'] = myvar.encode("utf8")
template = env.get_template('index.html')
index_file = open(preview_root + "/" + "index.html", "w")
index_file.write(template.render(index_variables))
index_file.close()
Now, the problem is that myvar is a message read from a message queue and can contain non-ASCII characters (e.g. "Séptimo Cine").
The rendered template looks something like:
... <title>S\u00e9ptimo Cine</title> ...
and I want it to be:
... <title>Séptimo Cine</title> ...
I have run several tests, but I can't get this to work.
I have tried to set the title variable without .encode("utf8"), but it throws an exception (ValueError: Expected a bytes object, not a unicode object), so my guess is that the initial message is unicode.
I have used chardet.detect to get the encoding of the message (it's "ascii"), then did the following: myvar.decode("ascii").encode("cp852"), but the title is still not rendered correctly.
I also made sure that my template is a UTF-8 file, but it didn't make a difference.
Any ideas on how to do this?
Jinja uses Unicode internally, which means you have to pass the render function either Unicode objects or bytestrings that consist only of ASCII characters.
Every UTF (Unicode Transformation Format) can represent any Unicode character. UTF-8 is based on 8-bit code units: each character is encoded as 1 to 4 bytes, and the first 128 Unicode code points are encoded as 1 byte each.
UTF-8 is a character encoding - a way of converting from sequences of bytes to sequences of characters and vice versa. It covers the whole of the Unicode character set.
UTF-8 encodes Unicode characters into a sequence of 8-bit bytes. The standard has a capacity for over a million distinct codepoints and is a superset of all characters in widespread use today. By comparison, ASCII (American Standard Code for Information Interchange) includes 128 character codes.
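To make the byte counts above concrete, here is a small sketch that encodes a few characters and counts the resulting bytes. It is written in Python 3 syntax (where `str` is Unicode text and `bytes` is a bytestring), and the helper name `utf8_length` is made up for illustration:

```python
def utf8_length(char):
    """Return how many bytes one character occupies in UTF-8."""
    return len(char.encode("utf-8"))

# ASCII letter, accented Latin letter, euro sign, emoji (U+1F600)
samples = ["S", "é", "€", "\U0001F600"]
lengths = {c: utf8_length(c) for c in samples}

# The title from the question has 12 characters but 13 bytes,
# because "é" needs two bytes in UTF-8.
title = "Séptimo Cine"
encoded_title = title.encode("utf-8")
```

The ASCII letter takes one byte, while the other three samples take two, three, and four bytes respectively.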
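TL;DR: pass Unicode to template.render(), then encode the rendered Unicode result to a bytestring before writing it to a file.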
This had me puzzled for a while. Because you do
index_file.write(template.render(index_variables))
in one statement, that's basically just one line where Python is concerned, so the traceback you get is misleading: the exception I got when recreating your test case didn't happen in template.render(index_variables), but in index_file.write() instead. So splitting the code up like this
output = template.render(index_variables)
index_file.write(output)
was the first step to diagnose where exactly the UnicodeEncodeError happens.
Jinja returns unicode when you let it render the template. Therefore you need to encode the result to a bytestring before you can write it to a file:
index_file.write(output.encode('utf-8'))
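As an alternative to encoding by hand, the file can be opened in text mode with an explicit encoding, so that the file object does the encoding on write. A minimal sketch, assuming `io.open` (which is the same function as the built-in `open` on Python 3); the `rendered` string and the temporary path are stand-ins for the output of template.render() and the real preview file:

```python
import io
import os
import tempfile

# Stand-in for the Unicode string returned by template.render(...)
rendered = u"<title>S\u00e9ptimo Cine</title>"

# Hypothetical output location, used only for this sketch
path = os.path.join(tempfile.mkdtemp(), "index.html")

# Text mode with an explicit encoding: the file object encodes on write
with io.open(path, "w", encoding="utf-8") as index_file:
    index_file.write(rendered)

# Read the raw bytes back to see what actually landed on disk
with io.open(path, "rb") as index_file:
    raw = index_file.read()
```

Either approach should put the same UTF-8 bytes on disk; the difference is only whether your code or the file object performs the encoding.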
The second error is that you pass a UTF-8 encoded bytestring to template.render(), but Jinja wants unicode. So assuming your myvar contains UTF-8, you need to decode it to unicode first:
index_variables['title'] = myvar.decode('utf-8')
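The same decoding step can be wrapped in a small helper so that values which are already Unicode pass through untouched. This is only a sketch in Python 3 syntax (where `bytes` is the bytestring type and `str` is Unicode); `to_text` is a made-up name, and UTF-8 is assumed to be the encoding of the incoming message:

```python
def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding it first if it is a bytestring."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# UTF-8 bytes for the example title, as they might arrive from the queue
myvar = b"S\xc3\xa9ptimo Cine"
title = to_text(myvar)

# Decoding with the wrong codec fails loudly instead of producing garbage
try:
    to_text(myvar, encoding="ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

Decoding once at the boundary, where the message is read, keeps everything handed to the template as Unicode.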
So, to put it all together, this works for me:
# -*- coding: utf-8 -*-
from jinja2 import Environment, PackageLoader

env = Environment(loader=PackageLoader('myproject', 'templates'))

# Make sure we start with an utf-8 encoded bytestring
myvar = 'Séptimo Cine'

index_variables = {'title': ''}

# Decode the UTF-8 string to get unicode
index_variables['title'] = myvar.decode('utf-8')

template = env.get_template('index.html')

with open("index_file.html", "wb") as index_file:
    output = template.render(index_variables)

    # jinja returns unicode - so `output` needs to be encoded to a bytestring
    # before writing it to a file
    index_file.write(output.encode('utf-8'))
Try changing your render command to this...
template.render(index_variables).encode( "utf-8" )
Jinja2's documentation says "This will return the rendered template as unicode string."
http://jinja.pocoo.org/docs/api/?highlight=render#jinja2.Template.render
Hope this helps!