I am attempting to implement this Jinja nl2br filter. It works correctly except that the <br>'s it adds are being escaped. This is weird to me because the <p>'s are not being escaped, and they are all in the same string.
I am using Flask, so the Jinja autoescape is enabled. I was really hopeful when I found this guy saying that autoescape and escape(value) may have been causing double escaping, but removing the escape() did not help.
Here is my modified code and its output:
@app.template_filter()
@evalcontextfilter
def nl2br(eval_ctx, value):
    _paragraph_re = re.compile(r'(?:\r\n|\r(?!\n)|\n){2,}')
    result = u'\n\n'.join(u'<p>%s</p>' % escape(p.replace(u'\r\n', u'<br>\n')) for p in _paragraph_re.split(value))
    if eval_ctx.autoescape:
        result = Markup(result)
    return result
input:
u'1\r\n2\r\n\r\n3\r\n4\r\n\r\n5\r\n6\r\n7'
output:
<p>1&lt;br&gt;
2</p>
<p>3&lt;br&gt;
4</p>
<p>5&lt;br&gt;
6&lt;br&gt;
7</p>
desired output:
<p>1<br>2</p>
<p>3<br>4</p>
<p>5<br>6<br>7</p>
What could be causing the <br>'s to be escaped but not the <p>'s?
The nl2br filter doesn't handle Markup objects correctly. If value is Markup, then the inserted <br> tags will be escaped. To fix it, the <br> tag must be Markup too:
@app.template_filter()
@evalcontextfilter
def nl2br(eval_ctx, value):
    _paragraph_re = re.compile(r'(?:\r\n|\r(?!\n)|\n){2,}')
    result = u'\n\n'.join(u'<p>%s</p>' % p.replace(u'\n', Markup('<br>\n'))
                          for p in _paragraph_re.split(value))
    if eval_ctx.autoescape:
        result = Markup(result)
    return result
Note: I normalized line endings to \n.
Here's a longer explanation of what's happening:
Splitting Markup objects produces many Markup objects:
>>> Markup("hello there").split()
[Markup(u'hello'), Markup(u'there')]
According to Jinja's documentation for Markup:
Operations on a markup string are markup aware which means that all arguments are passed through the escape() function.
Looking back at the main transformation of nl2br, we can see what's happening and why it didn't work:
result = u'\n\n'.join(u'<p>%s</p>' % p.replace(u'\n', u'<br>\n')
                      for p in _paragraph_re.split(value))
u'\n\n' and u'<br>\n' are unicode strings, but p is Markup, having been split from value, which is a Markup object. p.replace tries to insert a unicode string into the Markup object p, but the Markup object correctly intercepts and escapes the string first.
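The escaping described above can be reproduced in a few lines. This is a minimal sketch that assumes Markup is imported from the markupsafe package (the class Jinja2 uses); the sample value is made up for illustration.

```python
from markupsafe import Markup

# Hypothetical sample value standing in for one paragraph `p`.
p = Markup("1\n2")

# A plain string argument is treated as unsafe and escaped before insertion.
escaped = p.replace("\n", "<br>\n")

# A Markup argument is treated as safe and inserted unchanged.
kept = p.replace("\n", Markup("<br>\n"))

print(repr(str(escaped)))  # '1&lt;br&gt;\n2'
print(repr(str(kept)))     # '1<br>\n2'
```

Both results are still Markup objects; only the treatment of the inserted argument differs.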
The <p> tags aren't escaped because of how Python assembles the final string. Since the % formatting operator is applied to a unicode string, it uses the unicode representation of the elements passed to it. The Markup elements have already been declared safe, so they aren't escaped any further. result ends up as a unicode string.
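The difference between formatting with a plain string template and with a Markup template can be checked directly. A small sketch, again assuming markupsafe and using made-up sample values:

```python
from markupsafe import Markup

# Hypothetical paragraph content that has already been marked safe.
piece = Markup("1<br>\n2")

# Plain string on the left: ordinary % formatting, nothing is escaped,
# and the result is a plain str rather than Markup.
plain = "<p>%s</p>" % piece

# Markup on the left: the argument is a plain string, so it is escaped.
wrapped = Markup("<p>%s</p>") % "1<br>2"

print(type(plain).__name__, repr(plain))  # str '<p>1<br>\n2</p>'
print(repr(str(wrapped)))                 # '<p>1&lt;br&gt;2</p>'
```

This is why the <p> tags in the filter come through untouched: the template string is plain, so the final Markup(result) call is what marks the whole string as safe.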
The other two answers here at the time I'm writing this will not escape <br/> tags, but they are vulnerable to XSS. Test it out with this input string:
';alert(String.fromCharCode(88,83,83))//';alert(String.fromCharCode(88,83,83))//";
alert(String.fromCharCode(88,83,83))//";alert(String.fromCharCode(88,83,83))//--
></SCRIPT>">'><SCRIPT>alert(String.fromCharCode(88,83,83))</SCRIPT>
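To see the problem without running a browser, the core string transformation can be applied to the last line of that test string with and without escaping. This is only a sketch: it assumes escape comes from markupsafe and leaves out the Flask filter machinery.

```python
from markupsafe import escape

# Last line of the test string above.
payload = "></SCRIPT>\">'><SCRIPT>alert(String.fromCharCode(88,83,83))</SCRIPT>"

# Transformation without escape(): the markup in the input survives.
unescaped = "<p>%s</p>" % payload.replace("\n", "<br>\n")

# Same transformation on escaped input: the tags are neutralized.
escaped = "<p>%s</p>" % str(escape(payload)).replace("\n", "<br>\n")

print("<SCRIPT>" in unescaped)  # True
print("<SCRIPT>" in escaped)    # False
```

Wrapping the unescaped result in Markup, as the filter does when autoescape is on, tells Jinja not to escape it again, so the script tag would reach the page as is.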
The original nl2br jinja snippet by Dan Jacob is almost there:
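import re
from flask import Flask
# On Jinja 3.1+, evalcontextfilter is replaced by pass_eval_context, and
# Markup and escape are imported from markupsafe instead.
from jinja2 import evalcontextfilter, Markup, escape
_paragraph_re = re.compile(r'(?:\r\n|\r|\n){2,}')
app = Flask(__name__)
@app.template_filter()
@evalcontextfilter
def nl2br(eval_ctx, value):
    # Escape the input first, then split it into paragraphs.
    result = u'\n\n'.join(u'<p>%s</p>' % p.replace('\n', '<br>\n')
                          for p in _paragraph_re.split(escape(value)))
    if eval_ctx.autoescape:
        result = Markup(result)
    return result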
The code above already works as long as value is just a string. It only fails if value is already a Markup object, since then the .replace() call causes the '<br>' string to get escaped. This follows from the way Jinja2 generally handles escaping: Markup objects are presumed to be safe, normal string objects are presumed to be unsafe, and so operations that combine the two automatically invoke escaping on the normal string object.
To fix this, just combine that with @joemaller's answer of creating a Markup('<br/>\n') object, i.e.:
result = u'\n\n'.join(u'<p>%s</p>' % p.replace('\n', Markup('<br/>\n'))
                      for p in _paragraph_re.split(escape(value)))
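Putting the pieces together, here is a minimal sketch of a standalone filter function. It is not the original snippet: it assumes markupsafe for Markup and escape, drops the eval_ctx argument and always returns Markup, and normalizes line endings up front. Converting the escaped value to a plain str before the string operations is a choice made for this sketch, so the result does not depend on whether intermediate pieces happen to be Markup.

```python
import re

from markupsafe import Markup, escape

# After normalization, two or more newlines separate paragraphs.
_paragraph_re = re.compile(r"\n{2,}")


def nl2br(value):
    """Escape value, wrap paragraphs in <p>, turn single newlines into <br>."""
    # escape() escapes plain strings and leaves Markup unchanged; str() drops
    # the Markup wrapper so the operations below are ordinary string operations.
    text = str(escape(value))
    # Normalize Windows and old Mac line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = _paragraph_re.split(text)
    html = "\n\n".join(
        "<p>%s</p>" % p.replace("\n", "<br>\n") for p in paragraphs
    )
    # Everything user-supplied was escaped above, so mark the result as safe.
    return Markup(html)


print(nl2br("1\r\n2\r\n\r\n3"))
print(nl2br("<script>alert(1)</script>"))
```

With Flask, a function like this could be registered with app.add_template_filter(nl2br), assuming autoescape is enabled for the templates that use it.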
Did you try it with the escape removed? The code below works for me:
@app.template_filter()
@evalcontextfilter
def nl2br(eval_ctx, value):
    _paragraph_re = re.compile(r'(?:\r\n|\r(?!\n)|\n){2,}')
    result = u'\n\n'.join(u'<p>%s</p>' % p.replace(u'\r\n', u'<br/>') for p in _paragraph_re.split(value))
    if eval_ctx.autoescape:
        result = Markup(result)
    return result
When used in a template like this:
{{ '1\r\n2\r\n\r\n3\r\n4\r\n\r\n5\r\n6\r\n7' | nl2br }}
it gives me the output below:
<p>1<br/>2</p>
<p>3<br/>4</p>
<p>5<br/>6<br/>7</p>