
Ruby - UTF-8 file encoding

We all know the "magical"

# encoding: utf-8 

line. But I've seen several other alternative notations, some of them pretty wild. Do you know or use any of those? Is there some more general rule of what is acceptable?

Edit: Ok, apparently there are exactly 3 ways:

# encoding: UTF-8
# coding: UTF-8
# -*- coding: UTF-8 -*-

(more at http://blog.grayproductions.net/articles/ruby_19s_three_default_encodings)

PJK asked Jun 22 '11 18:06




1 Answer

No, there are not "exactly 3 ways" to specify the "magic comment"; there are infinitely many. Any comment on the first line (or on the second line, if the first is a shebang) that contains coding: followed by an encoding name will work, according to JEG2:

... the preferred way to set your source Encoding ... it's called a magic comment. If the first line of your code is a comment that includes the word coding, followed by a colon and space, and then an Encoding name, the source Encoding for that file is changed to the indicated Encoding.

So, any of these should work:

# coding: UTF-8
# encoding: UTF-8
# zencoding: UTF-8
# vocoding: UTF-8
# fun coding: UTF-8
# decoding: UTF-8
# 863280148705622662 coding: UTF-8 0072364213
# It was the night before Christmas and all through the house, not a creature was coding: UTF-8, not even with a mouse.
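One way to check what a given interpreter actually does with such comments is to write each one to a temporary file, load it, and read back `__ENCODING__`. The following is a minimal sketch, assuming a Ruby with `Tempfile.create` (2.1 or later). The helper name `source_encoding_for` and the global `$magic_comment_probe` are made up for illustration. It uses ISO-8859-1 instead of UTF-8 because UTF-8 has been the default source encoding since Ruby 2.0, so a UTF-8 comment would not produce a visible change. How the looser variants are treated may depend on the Ruby version, so the loop only prints what it observes.

```ruby
require "tempfile"

# Hypothetical helper: returns the source encoding Ruby assigns to a file
# whose first line is +first_line+. The file body stores __ENCODING__ in a
# global variable, which we read back after loading the file.
def source_encoding_for(first_line)
  Tempfile.create(["magic_comment", ".rb"]) do |file|
    file.write("#{first_line}\n$magic_comment_probe = __ENCODING__\n")
    file.flush
    load file.path
  end
  $magic_comment_probe
end

# The three notations from the question, plus two of the looser ones.
candidates = [
  "# encoding: ISO-8859-1",
  "# coding: ISO-8859-1",
  "# -*- coding: ISO-8859-1 -*-",
  "# fun coding: ISO-8859-1",
  "# zencoding: ISO-8859-1",
]

candidates.each do |comment|
  puts "#{comment.ljust(32)} => #{source_encoding_for(comment)}"
end
```

Any line that reports ISO-8859-1 was accepted as a magic comment by the running interpreter; a line that reports UTF-8 was treated as an ordinary comment.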
David J. answered Sep 24 '22 13:09