
How to set the character encoding in a yaml file


We are working with the simple backend for the new Rails 2.2 i18n system, and I would like to know the proper syntax for setting the encoding in a YAML file.

In other words, what is the YAML equivalent of this XML:

<?xml encoding="UTF-8" ?>
csexton asked Jan 28 '09 18:01




1 Answer

You can't declare the encoding inside a YAML document. But there's also no need to: the encoding is a property of the file (the byte stream), and the YAML processor works it out before parsing begins. When writing a YAML document, all you need to do is save the file in a supported encoding.
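As a rough illustration of "the encoding lives at the file level", here is a minimal Python sketch. The file name `en.yml` and the sample translation key are made up for the example; the only point is that the encoding is chosen when the bytes are written, and the YAML text itself carries no encoding declaration.

```python
import os
import tempfile

# Hypothetical i18n content with non-ASCII characters; note there is
# no encoding header anywhere in the YAML text itself.
yaml_text = 'en:\n  greeting: "Grüße, 世界"\n'

# Hypothetical file name, written into a throwaway temp directory.
path = os.path.join(tempfile.mkdtemp(), "en.yml")

# The encoding is chosen here, at the file level, when writing bytes.
# newline="" keeps the line endings exactly as given.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write(yaml_text)

# Read the raw bytes back: plain UTF-8, no byte order mark.
with open(path, "rb") as f:
    raw = f.read()

round_tripped = raw.decode("utf-8")
```

A file saved this way has no byte order mark, so a conforming processor treats it as UTF-8.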

At the file level, YAML 1.1 supports UTF-8 and UTF-16 but not UTF-32. The 1.1 specification states:

All characters [...] are Unicode code points. Each such code point is written as one or more octets depending on the character encoding used. Note that in UTF-16, characters above #xFFFF are written as four octets, using a surrogate pair. A YAML processor must support the UTF-16 and UTF-8 character encodings. If a character stream does not begin with a byte order mark (#FEFF), the character encoding shall be UTF-8. Otherwise it shall be either UTF-8, UTF-16 LE or UTF-16 BE as indicated by the byte order mark. On output, it is recommended that a byte order mark should only be emitted for UTF-16 character encodings. Note that the UTF-32 encoding is explicitly not supported.
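The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the byte order mark check, not how any particular YAML library implements it; the function name `decode_yaml11_stream` is invented for the example.

```python
import codecs


def decode_yaml11_stream(data: bytes) -> str:
    """Decode raw bytes using the YAML 1.1 rule: a byte order mark
    selects UTF-8 or UTF-16 (LE/BE); without one, assume UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # UTF-32 is not supported in YAML 1.1, so FF FE is read as UTF-16 LE.
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No byte order mark: the stream is UTF-8.
    return data.decode("utf-8")
```

For example, `decode_yaml11_stream(codecs.BOM_UTF16_BE + "a: 1".encode("utf-16-be"))` returns the text without the mark, and a plain UTF-8 byte string is decoded unchanged.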

For YAML 1.2, UTF-32 is supported as well.
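Once UTF-32 is allowed, a byte-order-mark check has to test the UTF-32 marks first, because the UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark. The sketch below shows only that ordering; the name `decode_with_bom` is hypothetical, and it covers BOM-based detection only, so any detection of BOM-less UTF-16 or UTF-32 streams is out of scope here.

```python
import codecs

# Longest marks first: the UTF-32 LE mark (FF FE 00 00) starts with
# the UTF-16 LE mark (FF FE), so the order of this table matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(data: bytes) -> str:
    """Strip a leading byte order mark, if any, and decode accordingly.
    Streams without a mark are assumed to be UTF-8 in this sketch."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    return data.decode("utf-8")
```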

bzlm answered Sep 23 '22 09:09
