Is this XML "valid"?
<?xml version="1.0"?>
<p class="leaders">
Todd
<span class="leader-type">.</span>
R
<span class="leader-type">.</span>
Colas
</p>
I've never seen an XML doc with multiple "values" for a node, as this one has for the <p> node.
How do I parse out the three values for <p> with TXMLDocument? And how do I traverse to the <span> nodes?
Finally...how do I create an XML document like this with TXMLDocument????
Help!!!!
When you say, is it valid, I think you mean: is it well-formed? (We can't tell whether it is valid without a DTD or schema).
Yes, it is well-formed. It is a perfectly normal example of a document containing mixed content, which is what XML is designed for.
I can't answer your questions about TXMLDocument because I've never heard of it: presumably it's part of a Delphi XML library.
Yes, it is well-formed XML. To parse it, you have to understand that an XML document is represented as a tree of nodes. This XML would parse into the following tree structure:
p
|_ attributes
|  |_ "class"="leaders"
|
|_ children
   |_ #text "Todd"
   |
   |_ span
   |  |_ attributes
   |  |  |_ "class"="leader-type"
   |  |
   |  |_ children
   |     |_ #text "."
   |
   |_ #text "R"
   |
   |_ span
   |  |_ attributes
   |  |  |_ "class"="leader-type"
   |  |
   |  |_ children
   |     |_ #text "."
   |
   |_ #text "Colas"
Each attribute and child node is represented as a separate IXMLNode interface in the TXMLDocument. As you can see, the plain text portions are separated into their own #text nodes.
Once you have loaded the XML into TXMLDocument, the TXMLDocument.DocumentElement property represents the <p> node. That node's AttributeNodes property contains a "class" node, and its ChildNodes property contains the first level of #text and <span> nodes. The <span> nodes have their own AttributeNodes and ChildNodes lists, and so on. So to parse this, you would iterate through the tree looking for the #text nodes, using the <span> nodes to manipulate the text as needed.
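As a rough illustration of that traversal, here is a minimal console sketch. It assumes a recent Delphi with the namespaced units (Xml.XMLIntf, Xml.XMLDoc) and the default MSXML DOM vendor on Windows, which is why it calls CoInitialize. DumpNode and Run are hypothetical helper names, and the XML is written without line breaks so that whitespace handling does not affect the result.

```delphi
program WalkMixedContent;

{$APPTYPE CONSOLE}

uses
  Winapi.ActiveX, Xml.XMLIntf, Xml.XMLDoc;

// Hypothetical helper: print a node, its attributes, and its children,
// indenting two spaces per level of depth.
procedure DumpNode(const Node: IXMLNode; Depth: Integer);
var
  I: Integer;
  Indent: string;
begin
  Indent := StringOfChar(' ', Depth * 2);
  case Node.NodeType of
    ntText:
      Writeln(Indent, '#text "', Node.Text, '"');
    ntElement:
      begin
        Writeln(Indent, '<', Node.NodeName, '>');
        for I := 0 to Node.AttributeNodes.Count - 1 do
          Writeln(Indent, '  @', Node.AttributeNodes[I].NodeName,
            ' = ', Node.AttributeNodes[I].Text);
        // Recurse into #text and element children in document order.
        for I := 0 to Node.ChildNodes.Count - 1 do
          DumpNode(Node.ChildNodes[I], Depth + 1);
      end;
  end;
end;

// Kept in its own procedure so every interface reference is released
// before CoUninitialize runs.
procedure Run;
var
  Doc: IXMLDocument;
begin
  Doc := LoadXMLData('<p class="leaders">Todd' +
    '<span class="leader-type">.</span>R' +
    '<span class="leader-type">.</span>Colas</p>');
  DumpNode(Doc.DocumentElement, 0);
end;

begin
  CoInitialize(nil); // assumption: MSXML vendor, which requires COM
  try
    Run;
  finally
    CoUninitialize;
  end;
end.
```

The listing it prints should resemble the tree shown earlier: the three #text children of <p> are the "values" asked about, and each <span> shows up as an element child with its own attribute and #text child.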
To create such a document, you simply create the individual nodes as needed, for example:
// Reset the document so it starts out empty
Doc.Active := False;
Doc.Active := True;
// Root element: <p class="leaders">
Node := Doc.AddChild('p');
Node.Attributes['class'] := 'leaders';
// First #text child
Child := Doc.CreateNode('Todd', ntText);
Node.ChildNodes.Add(Child);
// <span class="leader-type">.</span>
Child := Node.AddChild('span');
Child.Attributes['class'] := 'leader-type';
Child.Text := '.';
// Second #text child
Child := Doc.CreateNode('R', ntText);
Node.ChildNodes.Add(Child);
// Second <span>
Child := Node.AddChild('span');
Child.Attributes['class'] := 'leader-type';
Child.Text := '.';
// Third #text child
Child := Doc.CreateNode('Colas', ntText);
Node.ChildNodes.Add(Child);
Doc.SaveTo...(...); // generate the XML to your preferred output
If you want whitespace/linebreaks to appear in the XML output, simply include those characters in the content of the #text nodes. When parsing XML into TXMLDocument, unnecessary whitespace is stripped off by default. If you want to preserve it, enable the poPreserveWhiteSpace flag in the ParseOptions property before loading the XML.
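Setting that flag might look like the sketch below. It makes the same assumptions as before (namespaced units, MSXML vendor on Windows, hence CoInitialize); Run is a hypothetical helper, and the inline XML string stands in for whatever source you actually load. It prints the number of child nodes under <p> so you can compare the result with and without the flag.

```delphi
program PreserveWhitespace;

{$APPTYPE CONSOLE}

uses
  Winapi.ActiveX, Xml.XMLIntf, Xml.XMLDoc;

procedure Run;
var
  Doc: IXMLDocument;
begin
  Doc := TXMLDocument.Create(nil);
  // Must be set before the XML is loaded.
  Doc.ParseOptions := Doc.ParseOptions + [poPreserveWhiteSpace];
  Doc.LoadFromXML('<p class="leaders">'#13#10 +
    'Todd'#13#10 +
    '<span class="leader-type">.</span>'#13#10 +
    'R'#13#10 +
    '</p>');
  // Comment out the ParseOptions line above and rerun to compare.
  Writeln('Child nodes under <p>: ', Doc.DocumentElement.ChildNodes.Count);
end;

begin
  CoInitialize(nil); // assumption: MSXML vendor, which requires COM
  try
    Run;
  finally
    CoUninitialize;
  end;
end.
```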