Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does a valid XML file require an XML declaration?

People also ask

Is XML declaration mandatory?

The XML declaration is mandatory if the encoding of the document is anything other than UTF-8 or UTF-16. In practice, this means that documents encoded using US-ASCII can also omit the XML declaration because US-ASCII overlaps entirely with UTF-8. Only one encoding can be used for an entire XML document.

What makes an XML file valid?

Well Formed XML DocumentsXML documents must have a root element. XML elements must have a closing tag. XML tags are case sensitive. XML elements must be properly nested.

What is the XML declaration?

An XML declaration is a processing instruction that identifies a document as XML. DITA documents must begin with an XML declaration.

Which of the following is a valid XML version declaration?

xml version = "version_number" encoding = "encoding_declaration" standalone = "standalone_status" ?> Specifies the version of the XML standard used. It defines the character encoding used in the document. UTF-8 is the default encoding used.


In XML 1.0, the XML Declaration is optional. See section 2.8 of the XML 1.0 Recommendation, where it says it "should" be used -- which means it is recommended, but not mandatory. In XML 1.1, however, the declaration is mandatory. See section 2.8 of the XML 1.1 Recommendation, where it says "MUST" be used. It even goes on to state that if the declaration is absent, that automatically implies the document is an XML 1.0 document.

Note that in an XML Declaration the encoding and standalone are both optional. Only the version is mandatory. Also, these are not attributes, so if they are present they must be in that order: version, followed by any encoding, followed by any standalone.

<?xml version="1.0"?>
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" standalone="yes"?>
<?xml version="1.0" encoding="UTF-16" standalone="yes"?>

If you don't specify the encoding in this way, XML parsers try to guess what encoding is being used. The XML 1.0 Recommendation describes one possible way character encoding can be autodetected. In practice, this is not much of a problem if the input is encoded as UTF-8, UTF-16 or US-ASCII. Autodetection doesn't work when it encounters 8-bit encodings that use characters outside the US-ASCII range (e.g. ISO 8859-1) -- avoid creating these if you can.

The standalone indicates whether the XML document can be correctly processed without the DTD or not. People rarely use it. These days, it is a bad to design an XML format that is missing information without its DTD.

Update:

A "prolog error/invalid utf-8 encoding" error indicates that the actual data the parser found inside the file did not match the encoding that the XML declaration says it is. Or in some cases the data inside the file did not match the autodetected encoding.

Since your file contains a byte-order-mark (BOM) it should be in UTF-16 encoding. I suspect that your declaration says <?xml version="1.0" encoding="UTF-8"?> which is obviously incorrect when the file has been changed into UTF-16 by NotePad. The simple solution is to remove the encoding and simply say <?xml version="1.0"?>. You could also edit it to say encoding="UTF-16" but that would be wrong for the original file (which wasn't in UTF-16) or if the file somehow gets changed back to UTF-8 or some other encoding.

Don't bother trying to remove the BOM -- that's not the cause of the problem. Using NotePad or WordPad to edit XML is the real problem!


Xml declaration is optional so your xml is well-formed without it. But it is recommended to use it so that wrong assumptions are not made by the parsers, specifically about the encoding used.


It is only required if you aren't using the default values for version and encoding (which you are in that example).