I have a very simple Sinatra app running on Ruby 1.9.3 that uses ERB and markdown templates. I've stripped it right down to demonstrate the problem.
This is running Sinatra 1.3.2 on Mac OS X Snow Leopard. For the markdown I'm using rdiscount 1.6.8.
The main Ruby file contains
get '/services' do
  erb :services
end
The services.erb file has the following in it
<%= markdown :'content/service1' %>
£
Inside the markdown file I have just a single line
£
When I run the Sinatra app and load the 'services' page I get the exception "Encoding::CompatibilityError at /services: incompatible character encodings: UTF-8 and ASCII-8BIT" on the second line of the ERB file (the one containing just the '£').
I've done lots of Googling and I can't for the life of me figure out why this is happening. The ERB and markdown files are UTF-8 on my local disk, but obviously they are being loaded by Sinatra and turned into strings, and I've no idea how to tell what encoding those strings are.
If I force Sinatra to use ASCII-8BIT (by adding settings.default_encoding = 'ASCII-8BIT' to the top of my main Sinatra Ruby file) then no exception is thrown but the '£' characters come out looking wrong.
Any pointers?
This is an issue in Tilt, the templating system that Sinatra uses (and is being considered for Rails). Have a look at issues #75 and #107.
The problem is basically down to how Tilt reads template files from the disk: it uses binread. This means that the source string that is handed to the actual template engine has an associated encoding of ASCII-8BIT, which is basically saying that it’s unknown.
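To see the difference for yourself, here is a minimal sketch comparing the two ways of reading a file. The temp file is just a stand-in for a markdown template, and passing the encoding explicitly to File.read is my own choice for the demo rather than anything Tilt does:

```ruby
require 'tempfile'

# Stand-in for a template file: one line holding a pound sign (U+00A3).
file = Tempfile.new(['pound', '.md'])
file.write("\u00A3\n")
file.close

binary = File.binread(file.path)                  # how Tilt reads it, per the above
text   = File.read(file.path, encoding: 'UTF-8')  # explicit encoding (demo assumption)

puts binary.encoding == Encoding::ASCII_8BIT  # true
puts text.encoding == Encoding::UTF_8         # true
puts binary.bytes.to_a == text.bytes.to_a     # true: same bytes, different label
```

The bytes are identical either way; only the encoding tag attached to the string differs.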
RDiscount has code to set the encoding of the output to match the input, but this isn’t much help when the input encoding is ASCII-8BIT; the result is given the same encoding. The same thing (or something similar) happens with Kramdown, so simply switching won’t solve this.
This causes problems when the template has non-ASCII characters (e.g. £) and you try to combine the result with other UTF-8 encoded strings. If the template contains only ASCII characters, it is compatible with UTF-8 and Ruby can combine the two strings. If not, you get the CompatibilityError that you see.
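You can reproduce this outside Sinatra. The sketch below uses made-up strings (nothing here comes from Tilt or RDiscount) and force_encoding to fake the ASCII-8BIT tag a rendered template would carry:

```ruby
utf8         = "Price: \u00A3"                             # UTF-8, like the ERB source
binary_ascii = "plain".dup.force_encoding('ASCII-8BIT')    # binary tag, ASCII-only bytes
binary_pound = "\u00A3".dup.force_encoding('ASCII-8BIT')   # binary tag, non-ASCII bytes

# ASCII-only binary data can be combined with UTF-8 text.
combined = utf8 + binary_ascii
puts combined.encoding == Encoding::UTF_8          # true

# Encoding.compatible? returns nil when two strings cannot be combined.
puts Encoding.compatible?(utf8, binary_pound).nil? # true

error = nil
begin
  utf8 + binary_pound
rescue Encoding::CompatibilityError => e
  error = e   # same exception class as in the question
end
puts error.class
```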
A possible workaround is to read the template files yourself, and pass in the resulting string with the correct encoding to Tilt:
<%= markdown File.read './views/pound.md' %>
£
By reading the file yourself with read instead of binread, you can ensure it has the right encoding and so is compatible with the rest of the ERB file. You may want to read the file in once and cache the contents somewhere if you try this.
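One way the caching could look is sketched below. TemplateSource and read_utf8 are hypothetical names of my own, not part of Sinatra or Tilt, and the cache is a plain Hash that is never invalidated, so edits to the file won't be picked up until the app restarts:

```ruby
require 'tempfile'

# Hypothetical helper (not a Sinatra or Tilt API): reads a file as UTF-8
# once and hands back the cached string on later calls.
module TemplateSource
  CACHE = {}

  def self.read_utf8(path)
    CACHE[path] ||= File.read(path, encoding: 'UTF-8')
  end
end

# Demo with a temp file standing in for ./views/pound.md.
demo = Tempfile.new(['pound', '.md'])
demo.write("\u00A3\n")
demo.close

first  = TemplateSource.read_utf8(demo.path)
second = TemplateSource.read_utf8(demo.path)

puts first.encoding == Encoding::UTF_8  # true
puts first.equal?(second)               # true: second call came from the cache
```

In the ERB file the call would then become something like `<%= markdown TemplateSource.read_utf8('./views/pound.md') %>`.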
An alternative workaround would be to capture the output of the markdown method and use force_encoding on it:
<%= markdown(:pound).force_encoding('utf-8') %>
£
This is possible because although the encoding is ASCII-8BIT, you know that the bytes in the string really are UTF-8 encoded, so you can just change the encoding.
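As a quick illustration of why force_encoding is the right tool here rather than encode, consider this sketch. The rendered string is a made-up stand-in for the markdown output, not what RDiscount literally returns:

```ruby
# Stand-in for rendered markdown that came back tagged as ASCII-8BIT.
rendered = "<p>\u00A3</p>\n".dup.force_encoding('ASCII-8BIT')
before   = rendered.bytes.to_a

# encode tries to transcode, and binary bytes above 0x7F have no mapping.
encode_error = nil
begin
  rendered.encode('UTF-8')
rescue Encoding::UndefinedConversionError => e
  encode_error = e
end
puts encode_error.class

# force_encoding only relabels the string; the bytes are left alone.
rendered.force_encoding('UTF-8')
puts rendered.bytes.to_a == before  # true
puts rendered.valid_encoding?       # true, because the bytes really are UTF-8
```

If the bytes were not actually UTF-8, valid_encoding? would return false after relabelling, so it is a cheap sanity check to run.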