How to fix "Byte-Order Mark found in UTF-8 File" validation warning

I've got an XHTML page that validates under the XHTML Strict doctype, but I'm getting this warning, which I'm trying to understand and correct.

How do I locate this errant "Byte-Order Mark"? I'm editing my file using Visual Studio, in case that helps.

Warning Byte-Order Mark found in UTF-8 File.

The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to cause problems for some text editors and older browsers. You may want to consider avoiding its use until it is better supported.

rsturim asked Mar 31 '10 15:03


People also ask

Does UTF-8 require a byte order mark?

UTF-8 has the same byte order regardless of platform endianness, so a byte order mark isn't needed. However, it may occur (as the byte sequence EF BB BF) in data that was converted to UTF-8 from UTF-16, or as a "signature" to indicate that the data is UTF-8.

What is the UTF-16 LE BOM?

In UTF-16, a BOM (U+FEFF) may be placed as the first character of a file or character stream to indicate the endianness (byte order) of all the 16-bit code units of the file or stream.


1 Answer

The location part of your question is easy: The byte-order mark (BOM) will be at the very beginning of the file.
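If you'd rather confirm this outside the editor, here is a minimal Python sketch that checks whether a file starts with the UTF-8 BOM bytes (EF BB BF). The function name `has_utf8_bom` is made up for illustration; `codecs.BOM_UTF8` is the standard-library constant for those three bytes.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` begins with the UTF-8 BOM (EF BB BF)."""
    # Open in binary mode so no decoding happens and the raw bytes are visible.
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

Calling `has_utf8_bom("page.xhtml")` (the file name is just a placeholder) returns `True` when the mark is present.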

When you're editing the file, VS Code shows the encoding in use for the current file in the status bar at the bottom, toward the right:

Status bar showing "UTF-8 with BOM"

Click it to open the command palette with the options "Reopen with Encoding" and "Save with Encoding":

The command palette showing the options

Click "Save with Encoding" to get a list of encodings:

Command palette showing list of file encodings such as UTF-8, UTF-16 LE, UTF-16 BE

Choosing an encoding saves the file with that encoding. To drop the BOM, pick plain "UTF-8" rather than "UTF-8 with BOM".
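If you need to do the same thing without the editor (for several files, say), a rough Python equivalent is to rewrite the file without its first three bytes. This is only a sketch: `strip_utf8_bom` is a hypothetical helper name, and it assumes the file is small enough to read into memory.

```python
import codecs


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`.

    Returns True if a BOM was found and removed, False if the file was left alone.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Write back everything after the three BOM bytes; the rest is unchanged.
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Work on a copy first if the file matters, since this overwrites it in place.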

See also this note in the Unicode site's FAQ about the BOM and UTF-8 files. In a UTF-8 file, the BOM has no function other than to signal that the file is, in fact, UTF-8. In particular, it does not indicate byte order (the main reason we have BOMs), because the byte order of UTF-8 is fixed.
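To make the byte-order point concrete, this short Python snippet encodes the BOM character U+FEFF in UTF-8 and in both UTF-16 byte orders. The variable names are just for illustration.

```python
import codecs

bom = "\ufeff"  # the BOM character, U+FEFF

utf8 = bom.encode("utf-8")
utf16_le = bom.encode("utf-16-le")
utf16_be = bom.encode("utf-16-be")

# UTF-8 has exactly one byte sequence for the BOM.
print(utf8.hex())      # efbbbf
# UTF-16 has two, one per byte order, which is what makes the mark useful there.
print(utf16_le.hex())  # fffe
print(utf16_be.hex())  # feff

# These match the constants the standard library exposes.
assert utf8 == codecs.BOM_UTF8
assert utf16_le == codecs.BOM_UTF16_LE
assert utf16_be == codecs.BOM_UTF16_BE
```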

T.J. Crowder answered Sep 20 '22 19:09