
Putting French (accented) characters in Ruby file [duplicate]

Possible Duplicate:
invalid multibyte char (US-ASCII) with Rails and Ruby 1.9

How can I put French characters in a Ruby file? Here is the error I get:

SyntaxError in ArticlesController#show 

    /.../app/controllers/articles_controller.rb:47: invalid multibyte char (US-ASCII)
    /.../app/controllers/articles_controller.rb:47: invalid multibyte char (US-ASCII)
    /.../app/controllers/articles_controller.rb:47: syntax error, unexpected $end, expecting '}'
    ...@article, notice: 'Article a été créé avec succes.' }

In an HTML file I put this in the head and the accents work:

<!DOCTYPE html>

<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Script-Type" content="text/javascript">
<meta http-equiv="Content-Style-Type" content="text/css">
  <!-- ... autres mentions de l'entête de fichier ... -->
</head>
René asked Dec 21 '22 03:12


2 Answers

Ruby has a special syntax for declaring the encoding of a source file. If the file contains multibyte characters, put this comment at the very top of the file, with no preceding whitespace:

# encoding: utf-8
Alex answered Dec 23 '22 18:12


Since Ruby 1.9, Strings always have an encoding attached. So Ruby can properly handle multi-byte characters and is able to convert between different encodings. Prior versions of Ruby basically handled strings as byte arrays which made it nearly impossible to properly handle multiple encodings.
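To make the "encoding attached to every String" point concrete, here is a minimal sketch. The helper name describe is made up for illustration and is not part of Ruby; the word "été" is taken from the error message in the question.

```ruby
# encoding: UTF-8

# Hypothetical helper (not part of Ruby): summarise a string's
# encoding label, character count and byte count.
def describe(text)
  { encoding: text.encoding.name, chars: text.length, bytes: text.bytesize }
end

word   = "été"                       # UTF-8 literal: 3 characters, 5 bytes
latin1 = word.encode("ISO-8859-1")   # transcode: same characters, different bytes
raw    = word.dup.force_encoding(Encoding::BINARY)  # relabel only, bytes untouched

p describe(word)    # 3 chars, 5 bytes
p describe(latin1)  # 3 chars, 3 bytes
p describe(raw)     # 5 chars, 5 bytes: every byte now counts as one character
```

Note the difference between encode, which converts the bytes, and force_encoding, which only changes the label attached to the same bytes.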

By default, Ruby 1.9 assumes that source files are US-ASCII, while Ruby 2.0 and later assume UTF-8.

Generally, you only have to change anything if you are running Ruby 1.9. If your editor saves UTF-8 files and you are running Ruby >= 2.0, everything will be fine by default.

Still, in all Ruby versions since 1.9, you can change the encodings used. There are three different default encodings. The source encoding defaults to US-ASCII on 1.9 and to UTF-8 on Ruby 2.0 and newer; the external encoding is normally derived from your locale, and the internal encoding is unset (nil) unless you set it:

  • internal encoding: If set, data read from files and other IO is transcoded to this encoding, so it is the encoding your strings end up in.
  • external encoding: When reading files, assume them to be in this encoding.
  • source encoding: Assume the Ruby source code to be written in this encoding.
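If you are unsure what your interpreter is currently using, the three defaults listed above can be inspected directly. This is only a sketch; current_encodings is a made-up helper name, and the values printed depend on your Ruby version and locale.

```ruby
# Hypothetical helper (not part of Ruby): collect the three defaults
# for the running interpreter.
def current_encodings
  {
    source:   __ENCODING__,               # encoding of this source file
    external: Encoding.default_external,  # assumed for data read from IO
    internal: Encoding.default_internal   # transcoding target, nil if unset
  }
end

current_encodings.each { |name, enc| puts "#{name}: #{enc.inspect}" }
```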

The first two encodings can be set like this:

Encoding.default_internal = 'UTF-8'
Encoding.default_external = 'UTF-8'

They are then used for all operations during the lifetime of the current Ruby process.
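As a rough illustration of the effect, the sketch below sets both defaults and then reads a file back without passing any encoding option. The helper read_back and the temp file prefix are assumptions made for this example; Tempfile.create needs Ruby 2.1 or newer.

```ruby
# encoding: UTF-8
require "tempfile"

Encoding.default_external = "UTF-8"
Encoding.default_internal = "UTF-8"

# Hypothetical helper: write text to a temporary file and read it back
# using only the process-wide default encodings.
def read_back(text)
  Tempfile.create("accents") do |file|
    file.write(text)
    file.flush
    File.read(file.path)
  end
end

content = read_back("créé")
puts content.encoding         # label comes from the defaults set above
puts content.valid_encoding?
```

Because these are process-wide settings, they are best set once at startup rather than changed in the middle of a program.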

The source encoding can be set using a "magic comment" on the first line of your Ruby file (or on the second line, directly below the shebang) like so:

# encoding: UTF-8

or by starting your script with ruby -KU, which also sets the default external encoding to UTF-8. You can also put this option in your shebang line. In your specific case, you have to at least set the source encoding using one of these mechanisms.
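Applied to the string from the question, a file with the magic comment below the shebang might look like this. It is a sketch only: creation_notice is a made-up method name, not code from the asker's controller.

```ruby
#!/usr/bin/env ruby
# encoding: UTF-8

# Hypothetical method standing in for the notice text in the question.
def creation_notice
  'Article a été créé avec succes.'
end

puts __ENCODING__               # the source encoding Ruby used for this file
puts creation_notice.encoding   # string literals inherit the source encoding
```

On Ruby 1.9 the file fails to parse with "invalid multibyte char (US-ASCII)" if the comment is missing; on Ruby 2.0 and newer the comment is redundant but harmless.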

See http://graysoftinc.com/character-encodings and especially http://graysoftinc.com/character-encodings/ruby-19s-three-default-encodings for some more information and background on String encodings in Ruby 1.9.

Holger Just answered Dec 23 '22 17:12