
Is there any difference between 'valid xml' and 'well formed xml'?

I wasn't aware of a difference, but a coworker says there is, although he can't back it up. What's the difference, if any?

user18931 asked Sep 25 '08 16:09



2 Answers

There is a difference, yes.

XML that adheres to the XML standard is considered well-formed, while XML that also adheres to a DTD is considered valid.

Kilhoffer answered Oct 09 '22 21:10


Well-formed vs Valid XML

Well-formed means that a textual object meets the W3C requirements for being XML.

Valid means that well-formed XML meets additional requirements given by a specified schema.


Official Definitions

Per the W3C Recommendation for XML:

[Definition: A data object is an XML document if it is well-formed, as defined in this specification. In addition, the XML document is valid if it meets certain further constraints.]


Observations:

  • A document that is not well-formed is not XML. (The phrase "well-formed XML" is commonly used but technically redundant.)
  • Being valid implies being well-formed.
  • Being well-formed does not imply being valid.
  • Although the W3C Recommendation for XML defines validity to be against a DTD, conventional use allows the term to be applied for conformance to XML schemas specified via XSD, RELAX NG, Schematron, or other methods.

Examples of what causes a document to be...

Not well-formed:

  • An element lacks a closing tag (and is not self-closing).
  • Elements overlap without proper nesting: <a><b></a></b>
  • An attribute value is missing a closing quote that matches the opening quote.
  • < or & are used in content rather than &lt; or &amp;.
  • Multiple root elements exist.
  • Multiple XML declarations exist, or an XML declaration appears other than at the top of the document.
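
The cases listed above can be checked with any conforming parser. Here is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree (a non-validating parser, so it tests well-formedness only). The helper name is_well_formed and the sample strings are illustrative and not part of any specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as an XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One illustrative sample per case in the list above.
NOT_WELL_FORMED = {
    "missing closing tag": "<a><b></a>",
    "overlapping elements": "<a><b></a></b>",
    "unterminated attribute value": '<a x="1></a>',
    "raw < in content": "<a>1 < 2</a>",
    "raw & in content": "<a>AT&T</a>",
    "multiple root elements": "<a/><b/>",
    "two XML declarations": '<?xml version="1.0"?><?xml version="1.0"?><a/>',
    "declaration not at the top": ' <?xml version="1.0"?><a/>',
}

if __name__ == "__main__":
    for label, sample in NOT_WELL_FORMED.items():
        print(f"{label}: well-formed = {is_well_formed(sample)}")
    # Escaping the special characters makes the content acceptable.
    print(is_well_formed("<a>1 &lt; 2 &amp; 3</a>"))
```

Every sample in the dictionary is rejected, while the escaped version at the end parses without error.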

Invalid:

  • An element or attribute is missing but required by the XML schema.
  • An element or attribute is used but undefined by the XML schema.
  • The content of an element does not match the content specified by the XML schema.
  • The value of an attribute does not match the type specified by the XML schema.
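
To make the kinds of validity failure above concrete, here is a rough sketch in Python. It does not use a real DTD or XSD: the SCHEMA dictionary and the validity_errors function are hypothetical stand-ins invented for this illustration, and real validation would normally be delegated to a validating parser or schema library. The point is only that validity is checked after, and in addition to, well-formedness.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy schema (NOT DTD or XSD syntax).
# children: element name -> required?
# attrs:    attribute name -> (required?, value check)
SCHEMA = {
    "book": {
        "children": {"title": True, "author": False},
        "attrs": {"id": (True, str.isdigit)},
    },
    "title": {"children": {}, "attrs": {}},
    "author": {"children": {}, "attrs": {}},
}


def validity_errors(text: str) -> list:
    """Return schema violations; raises ET.ParseError if not well-formed."""
    root = ET.fromstring(text)  # the well-formedness check comes first
    if root.tag not in SCHEMA:
        return [f"undefined root element <{root.tag}>"]
    errors = []
    for elem in root.iter():
        rule = SCHEMA.get(elem.tag)
        if rule is None:
            continue  # already reported when its parent was checked
        child_tags = [child.tag for child in elem]
        for tag in child_tags:
            if tag not in rule["children"]:
                errors.append(f"<{tag}> is not allowed inside <{elem.tag}>")
        for tag, required in rule["children"].items():
            if required and tag not in child_tags:
                errors.append(f"<{elem.tag}> is missing required <{tag}>")
        for name, value in elem.attrib.items():
            if name not in rule["attrs"]:
                errors.append(f"attribute '{name}' is undefined on <{elem.tag}>")
            elif not rule["attrs"][name][1](value):
                errors.append(f"attribute '{name}' has a bad value: {value!r}")
        for name, (required, _check) in rule["attrs"].items():
            if required and name not in elem.attrib:
                errors.append(f"<{elem.tag}> is missing required attribute '{name}'")
    return errors


if __name__ == "__main__":
    print(validity_errors('<book id="7"><title>XML</title></book>'))
    print(validity_errors('<book id="seven"><isbn/></book>'))
```

All of the inputs in the usage lines are well-formed; only the second one fails the toy schema, which mirrors the observation that being well-formed does not imply being valid.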

Namespace-Well-Formed

Technically, colon characters are permitted in component names in XML. However, colons should only be used in names for namespace purposes:

Note:

The Namespaces in XML Recommendation [XML Names] assigns a meaning to names containing colon characters. Therefore, authors should not use the colon in XML names except for namespace purposes, but XML processors must accept the colon as a name character.

Therefore, another term, namespace-well-formed, is defined in the Namespaces in XML 1.0 W3C Recommendation that implies all of the XML rules for well-formedness plus those governing namespaces and namespace prefixes.

Colloquially, the term well-formed is often used where namespace-well-formed would be more precise. However, this is a minor technical matter of less practical consequence than the distinction between well-formed vs valid XML described in this answer.
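
As an illustration of the namespace distinction, here is a small sketch in Python, assuming the standard library's Expat binding (xml.parsers.expat), which only performs namespace processing when a namespace_separator is supplied. The helper name parses and the sample documents are made up for this example.

```python
import xml.parsers.expat as expat


def parses(text: str, namespaces: bool) -> bool:
    """Parse `text` with Expat, with namespace processing on or off."""
    separator = " " if namespaces else None
    parser = expat.ParserCreate(namespace_separator=separator)
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
        return True
    except expat.ExpatError:
        return False


# The prefix "a" is never declared: fine under plain XML 1.0 rules,
# but an error once the namespace rules are applied.
UNBOUND = "<doc><a:item/></doc>"
# The same document with the prefix declared.
BOUND = '<doc xmlns:a="urn:example:a"><a:item/></doc>'

if __name__ == "__main__":
    print(parses(UNBOUND, namespaces=False))
    print(parses(UNBOUND, namespaces=True))
    print(parses(BOUND, namespaces=True))
```

The undeclared prefix is accepted when namespace processing is off and rejected when it is on, so that document is well-formed without being namespace-well-formed.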

kjhughes answered Oct 09 '22 21:10