I have a Rails project with a lot of Cyrillic strings in it.
It worked fine on Ruby 1.8, but Ruby 1.9 assumes source files are US-ASCII-encoded unless you provide an # encoding: utf-8 comment at the top of the source file. With that comment in place, the file is no longer treated as US-ASCII.
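For reference, this is roughly what a file with the magic comment looks like. It is only an illustration: the variable name and the Cyrillic literal are made up, not taken from the project.

```ruby
# encoding: utf-8
# The magic comment above has to be the first line of the file
# (or the second line, if the first line is a shebang).

greeting = "Привет, мир" # hypothetical Cyrillic literal

puts __ENCODING__        # source encoding Ruby assumed for this file
puts greeting.encoding   # plain string literals take the source encoding
puts greeting.length     # counted in characters, not bytes
```

Without the comment, Ruby 1.9 rejects the non-ASCII bytes in the literal when it parses the file.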
Is there a simpler way to tell Ruby "This application is UTF8-encoded. Please consider all and any included source files as UTF8 unless declared otherwise"?
UPDATE:
I wrote "How to insert the encoding: UTF-8 directive automatically in Ruby 1.9 files" which appends the encoding directive automatically if it's needed.
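A minimal sketch of what such a script might do is below. It is not the script from the linked post; the method name add_encoding_directive and the two constants are hypothetical. Since Ruby only honors the directive at the top of a file, the sketch inserts it on the first line, or on the second line when the file starts with a shebang, and leaves pure-ASCII files and files that already carry a directive untouched.

```ruby
# Sketch only: names below are illustrative, not from the original script.
MAGIC_COMMENT = "# encoding: utf-8\n"
# Matches comment lines such as "# encoding: utf-8" or "# -*- coding: utf-8 -*-".
MAGIC_PATTERN = /\A#.*coding[:=]\s*[\w.-]+/

# Returns the source text with the directive added when it is needed.
def add_encoding_directive(source)
  # Pure-ASCII files parse fine under the US-ASCII default.
  return source if source.ascii_only?

  lines = source.lines
  # The directive belongs on line 1, or on line 2 after a shebang.
  directive_index = lines.first.start_with?("#!") ? 1 : 0
  return source if lines[directive_index].to_s =~ MAGIC_PATTERN

  lines.insert(directive_index, MAGIC_COMMENT).join
end

# Possible usage on one file (the path is hypothetical):
#   path = "app/models/user.rb"
#   File.write(path, add_encoding_directive(File.read(path)))
```

The sketch assumes the file contents are already valid UTF-8 when read; it does not try to detect or convert other encodings.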
The way I read the spec, UTF-8 is not the default encoding in an XML declaration. It is only the default encoding "for an entity which begins with neither a Byte Order Mark nor an encoding declaration".
UTF-8 is an encoding system for Unicode. It can translate any Unicode character to a matching unique binary string, and can also translate the binary string back to a Unicode character. This is the meaning of “UTF”, or “Unicode Transformation Format.”
I think you can either pass the -E utf-8 command-line argument to ruby, or set the RUBYOPT environment variable to "-E utf-8".
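One way to check what a flag like -E utf-8 actually changes is to print the encodings Ruby reports. The snippet below is an illustration only; the method name encoding_report and the file name check_encoding.rb are hypothetical. As far as the ruby man page describes it, -E sets the default external (and optionally internal) encoding, so it is worth confirming whether the source encoding, __ENCODING__, changes as well.

```ruby
# Sketch: report the three encodings that the command-line flags can influence.
def encoding_report
  internal = Encoding.default_internal # nil unless explicitly set
  {
    source: __ENCODING__.name,                  # encoding assumed for this file
    external: Encoding.default_external.name,   # default for data read via IO
    internal: internal && internal.name         # transcoding target, if any
  }
end

encoding_report.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Saved as check_encoding.rb, it could be run once as ruby check_encoding.rb and once as ruby -E utf-8 check_encoding.rb to compare the two reports.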
In my opinion, explicit is not always better than implicit.
When almost all the source you use is UTF-8 compatible, you can easily avoid putting the magic encoding comment in every file by using Ruby's -Ku command-line option.
Don't confuse the "u" parameter of the -K option with the -U option.
-Ku : set the script (source) encoding and the default external encoding to UTF-8
-U : set the default internal encoding to UTF-8
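The difference between the two flags can be observed by running a small probe script under each of them. This is a sketch under assumptions: the helper name encodings_with and the PROBE constant are hypothetical, and the result depends on the Ruby version and on any RUBYOPT already set in the environment.

```ruby
require "open3"
require "rbconfig"
require "tempfile"

# Probe run in a child interpreter: prints source, external and internal encodings.
PROBE = 'print [__ENCODING__, Encoding.default_external, ' \
        'Encoding.default_internal || "none"].join("|")'

# Runs the probe with the given command-line flags and returns what it reported.
def encodings_with(*flags)
  Tempfile.create(["probe", ".rb"]) do |file|
    file.write(PROBE)
    file.flush
    output, _status = Open3.capture2(RbConfig.ruby, *flags, file.path)
    script, external, internal = output.split("|")
    { script: script, external: external, internal: internal }
  end
end

if __FILE__ == $0
  p encodings_with("-Ku") # expected to affect script and external encoding
  p encodings_with("-U")  # expected to affect internal encoding
end
```

The probe is written to a temporary file rather than passed with -e, because code given on the command line gets its source encoding from the locale rather than from the file default.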
Then, set the magic encoding comment only in scripts that need it. Remember, convention over configuration!
You can also set the environment variable RUBYOPT to -Ku so the option applies to every ruby invocation.
See Ruby's command-line options at http://www.manpagez.com/man/1/ruby/.