
ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT)

I am using Ruby 1.9.2, Rails 3.0.4/3.0.5 and Phusion Passenger 3.0.3/3.0.4. My templates are written in HAML and I am using the MySQL2 gem. One controller action raises the following error whenever it is passed a parameter containing a special character, such as an umlaut:

ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT)

The error points to the first line of my HAML template, which has the following code on it:

<!DOCTYPE html>

My understanding is that this happens because a UTF-8 string is being concatenated with an ASCII-8BIT string, but I can't for the life of me figure out what that ASCII-8BIT string is. I have checked that the params in the action are encoded as UTF-8, and I have added an encoding: UTF-8 declaration to the top of the HAML template and the Ruby files, yet I still get this error. My application.rb file also has a config.encoding = "UTF-8" declaration, and the following all result in UTF-8:

ENV['LANG']
__ENCODING__
Encoding.default_internal
Encoding.default_external
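
For illustration, the kind of mismatch described above can be reproduced outside Rails. This is a minimal sketch, not code from the application: the variable names and the concat_result helper are made up for the example, and the byte strings only stand in for whatever ASCII-8BIT value the app produces.

```ruby
# A UTF-8 string with non-ASCII characters ("\u" escapes always yield UTF-8).
utf8_text = "Gr\u00FC\u00DFe"

# Two byte strings tagged as ASCII-8BIT: one ASCII-only, one with umlaut bytes.
ascii_bytes  = "/posts".dup.force_encoding(Encoding::ASCII_8BIT)
umlaut_bytes = "/gr\xC3\xBC\xC3\x9Fe".dup.force_encoding(Encoding::ASCII_8BIT)

# Hypothetical helper: return the encoding name of the concatenation,
# or the name of the error class if Ruby refuses to combine the strings.
def concat_result(left, right)
  (left + right).encoding.name
rescue Encoding::CompatibilityError => e
  "error: #{e.class}"
end

ascii_outcome  = concat_result(utf8_text, ascii_bytes)
umlaut_outcome = concat_result(utf8_text, umlaut_bytes)

puts "UTF-8 + ASCII-only binary: #{ascii_outcome}"
puts "UTF-8 + non-ASCII binary:  #{umlaut_outcome}"
```

The first concatenation succeeds because the binary string contains only 7-bit characters; the second raises Encoding::CompatibilityError. That would be consistent with the error appearing only when the parameter contains a special character.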

Here's the kicker: I cannot reproduce this locally on Mac OS X using standalone Passenger or Mongrel, in either development or production mode. I can only reproduce it on a production server running nginx + Passenger on Linux. I have verified in the production server's console that the expressions listed above all result in UTF-8 there as well.

Have you run into this error, and if so, how did you solve it?

Pan Thomakos asked Mar 07 '11 07:03




2 Answers

After some debugging I found that the issue occurs when using values from the ActionDispatch::Request object, whose strings are all encoded in ASCII-8BIT regardless of whether my app uses UTF-8. I do not know why this only happens on the Linux production server; I assume it is some quirk in Ruby or Rails, since I was unable to reproduce the error locally. The error was triggered specifically by a line like this:

@current_path = request.env['PATH_INFO']

When this instance variable was printed in the HAML template, it raised the error because the string was encoded in ASCII-8BIT instead of UTF-8. To solve this I did the following:

@current_path = request.env['PATH_INFO'].dup.force_encoding(Encoding::UTF_8)

This makes @current_path a duplicate of the original string that is tagged with the proper UTF-8 encoding. The same error can also occur with other request-related data, such as request.headers.
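
The fix can be tried without Rails. The following is a minimal sketch under stated assumptions: env is a plain hash standing in for request.env, and utf8_copy is a hypothetical helper name, not part of Rails or Rack. Note that force_encoding only changes the encoding tag and leaves the bytes untouched, so it is worth checking valid_encoding? afterwards.

```ruby
# Hypothetical helper: return a UTF-8-tagged copy of a string and
# pass any non-string value through unchanged.
def utf8_copy(value)
  return value unless value.is_a?(String)
  value.dup.force_encoding(Encoding::UTF_8)
end

# Stand-in for request.env: the path holds UTF-8 bytes tagged as ASCII-8BIT.
env = {
  "PATH_INFO" => "/gr\xC3\xBC\xC3\x9Fe".dup.force_encoding(Encoding::ASCII_8BIT)
}

current_path = utf8_copy(env["PATH_INFO"])

# Concatenating with a non-ASCII UTF-8 string now works.
page = "<title>\u00DCbersicht</title> " + current_path

puts current_path.encoding         # encoding tag of the copy
puts current_path.valid_encoding?  # whether the bytes really are valid UTF-8
puts env["PATH_INFO"].encoding     # the original string is left as it was
```

Duplicating before calling force_encoding keeps the original request string untouched, which matters if other code still expects it to be binary.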

Pan Thomakos answered Oct 16 '22 06:10


MySQL could be the source of the troublesome ASCII-8BIT strings. Try putting the following in an initializer to at least eliminate this possibility:

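require 'mysql'

# Tag every String returned by the mysql gem as UTF-8.
class Mysql::Result
  # force_encoding only changes the encoding tag; the bytes stay the same.
  def encode(value, encoding = "utf-8")
    String === value ? value.force_encoding(encoding) : value
  end

  # Wrap #each so that every column of an array row is tagged as UTF-8.
  def each_utf8(&block)
    each_orig do |row|
      yield row.map {|col| encode(col) }
    end
  end
  alias each_orig each
  alias each each_utf8

  # Wrap #each_hash so that every value of a hash row is tagged as UTF-8.
  def each_hash_utf8(&block)
    each_hash_orig do |row|
      row.each {|k, v| row[k] = encode(v) }
      yield(row)
    end
  end
  alias each_hash_orig each_hash
  alias each_hash each_hash_utf8
end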



Edit: this may not apply to the mysql2 gem. It does work with the mysql gem, however.
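
The per-row relabeling can also be tried without either MySQL gem installed. This is only a sketch of the idea: utf8_value and utf8_row are hypothetical helpers operating on plain arrays and hashes, and the sample row is invented rather than read from a database.

```ruby
# Hypothetical helper: return a UTF-8-tagged copy of a String,
# or the value itself when it is not a String (numbers, nil, ...).
def utf8_value(value, encoding = Encoding::UTF_8)
  value.is_a?(String) ? value.dup.force_encoding(encoding) : value
end

# Hypothetical helper: relabel every value of an array row or hash row.
def utf8_row(row)
  case row
  when Hash  then row.each_with_object({}) { |(key, val), out| out[key] = utf8_value(val) }
  when Array then row.map { |col| utf8_value(col) }
  else row
  end
end

# Invented sample data: UTF-8 bytes for "Müller" tagged as ASCII-8BIT.
raw_name = "M\xC3\xBCller".dup.force_encoding(Encoding::ASCII_8BIT)

array_row = utf8_row([1, raw_name, nil])
hash_row  = utf8_row({ "id" => 1, "name" => raw_name })

puts array_row[1].encoding
puts hash_row["name"].encoding
```

Unlike the patch above, this version works on copies, so the strings in the original row keep their ASCII-8BIT tag.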

artemave answered Oct 16 '22 04:10