I am using Ruby 1.9.2, Rails 3.0.4/3.0.5 and Phusion Passenger 3.0.3/3.0.4. My templates are written in HAML and I am using the MySQL2 gem. I have a controller action that when passed a parameter that has a special character, like an umlaut, gives me the following error:
ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT)
The error points to the first line of my HAML template, which has the following code on it:
<!DOCTYPE html>
My understanding is that this is caused because I have a UTF-8 string that is being concatenated with an ASCII-8BIT string, but I can't for the life of me figure out what that ASCII-8BIT string is. I have checked that the params in the action are encoded using UTF-8 and I have added an encoding: UTF-8 declaration to the top of the HAML template and the ruby files and I still get this error. My application.rb file has a config.encoding = "UTF-8"
declaration in it as well and the following all result in UTF-8:
ENV['LANG']
__ENCODING__
Encoding.default_internal
Encoding.default_external
Here's the kicker: I cannot reproduce this result locally on my Mac-OSX using standalone passenger or mongrel in either development or production. I can only reproduce it on a production server running nginx+passenger on linux. I have verified in the production server's console that the latter mentioned commands all result in UTF-8 as well.
Have you experienced this same error and how did you solve it?
UTF-8 is backward-compatible with ASCII and can represent any standard Unicode character. The first 128 UTF-8 characters precisely match the first 128 ASCII characters (numbered 0-127), meaning that existing ASCII text is already valid UTF-8. All other characters use two to four bytes.
For characters represented by the 7-bit ASCII character codes, the UTF-8 representation is exactly equivalent to ASCII, allowing transparent round trip migration. Other Unicode characters are represented in UTF-8 by sequences of up to 6 bytes, though most Western European characters require only 2 bytes3.
0xC0, 0xC1, 0xF5, 0xF6, 0xF7, 0xF8, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF are invalid UTF-8 code units. A UTF-8 code unit is 8 bits.
Any text file encoded in ASCII can be decoded as UTF-8 to get exactly the same result.
After doing some debugging I found out the issue occurs when using the ActionDispatch::Request object which happens to have strings that are all coded in ASCII-8BIT, regardless of whether my app is coded in UTF-8 or not. I do not know why this only happens when using a production server on Linux, but I'm going to assume it's some quirk in Ruby or Rails since I was unable to reproduce this error locally. The error occurred specifically because of a line like this:
@current_path = request.env['PATH_INFO']
When this instance variable was printed in the HAML template it caused an error because the string was encoded in ASCII-8BIT instead of UTF-8. To solve this I did the following:
@current_path = request.env['PATH_INFO'].dup.force_encoding(Encoding::UTF_8)
Which forced @current_path
to use a duplicated string that was forced into the proper UTF-8 encoding. This error can also occur with other request related data like request.headers
.
Mysql could be the source of troublesome ascii. Try putting the following in initializer to at least eliminate this possibility:
require 'mysql'
class Mysql::Result
def encode(value, encoding = "utf-8")
String === value ? value.force_encoding(encoding) : value
end
def each_utf8(&block)
each_orig do |row|
yield row.map {|col| encode(col) }
end
end
alias each_orig each
alias each each_utf8
def each_hash_utf8(&block)
each_hash_orig do |row|
row.each {|k, v| row[k] = encode(v) }
yield(row)
end
end
alias each_hash_orig each_hash
alias each_hash each_hash_utf8
end
edit
This may not be applicable to mysql2 gem. Works for mysql however.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With