
Visualize XML tree structure

I have several XML files that share a similar structure but differ in ways I cannot ignore. They are all TEI documents.

I am looking for a way to outline the main structure.

Take the following text as an example:

<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
<body xml:id="d2">
<div1 type="book" xml:id="d3">
<head>Songs of Innocence</head>
<pb n="4"/>
<div2 type="poem" xml:id="d4">
<head>Introduction</head>
<lg type="stanza">
<l>Piping down the valleys wild, </l>
<l>Piping songs of pleasant glee, </l>
<l>On a cloud I saw a child, </l>
<l>And he laughing said to me: </l>
</lg>

I would like to collapse sibling nodes of the same type and all repeating structures:

<body xml:id="d2">
<div1 type="book" xml:id="d3">
<head>Songs of Innocence</head>
<pb n="4"/>
<div2 type="poem" xml:id="d4">
<head>Introduction</head>
<lg type="stanza">
<l>...</l>
</lg>
<lg>...</lg>

So, basically, I want to reduce each XML document to its most basic structure, so that I can figure out how to convert the files properly using XSLT.

Angelo asked Feb 26 '16 17:02


2 Answers

Here are some options for viewing your XML in a tree structure:

  1. Open the XML in a web browser and get an outline view with collapsible elements.
  2. Open the XML in graphics view in Oxygen, QTAssistant, or XMLSpy.
  3. Use Graphviz or a DotML Ant build to create your own visual representations.

Note, however, that you'll need to clean up your markup. What you show doesn't qualify as XML as it's missing end tags and lacks a single root element. (XML has to be well-formed.)
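If none of those viewers is at hand, a short script can print the kind of collapsed outline the question asks for. The following is a minimal sketch using only Python's standard library, not part of any tool listed above. It assumes the sample from the question with the missing end tags added (`SAMPLE`), and the helper names `local`, `merge`, and `render` are made up for illustration. Parsing raises `xml.etree.ElementTree.ParseError` when the input is not well-formed, so the cleanup mentioned above still has to happen first.

```python
import xml.etree.ElementTree as ET

# The question's sample, with the missing end tags added (an assumption).
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
<body xml:id="d2">
<div1 type="book" xml:id="d3">
<head>Songs of Innocence</head>
<pb n="4"/>
<div2 type="poem" xml:id="d4">
<head>Introduction</head>
<lg type="stanza">
<l>Piping down the valleys wild, </l>
<l>Piping songs of pleasant glee, </l>
<l>On a cloud I saw a child, </l>
<l>And he laughing said to me: </l>
</lg>
</div2>
</div1>
</body>
</text>"""


def local(tag):
    # ElementTree stores names as "{namespace}name"; keep only the name part.
    return tag.rsplit("}", 1)[-1]


def merge(elem, node):
    # Fold all same-named siblings into one nested dict entry, so repeated
    # structures appear once while still contributing their own children.
    for child in elem:
        merge(child, node.setdefault(local(child.tag), {}))
    return node


def render(node, depth=0):
    # Turn the nested dict into indented lines, in first-seen order.
    lines = []
    for name, sub in node.items():
        lines.append("  " * depth + name)
        lines.extend(render(sub, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)  # raises ET.ParseError if not well-formed
tree = {local(root.tag): merge(root, {})}
print("\n".join(render(tree)))
```

For the sample, the four `l` lines collapse into a single `l` entry under `lg`. Attributes are ignored here; they could be added to each line if the differences between files lie there.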

kjhughes answered Sep 30 '22 01:09


Using Perl's XML::DT (install it with apt-get install libxml-dt-perl if needed), the command mkxmltype file.xml prints a compact description of the XML structure. Example:

$ mkxmltype -lines=1000  a.xml 

# text ...Fri Feb 26 17:56:24 2016
text    =>  body * xml:id
body    =>  div1 * xml:id
div1    =>  tup(div2, pb, head) * type * xml:id
div2    =>  tup(head, lg) * type * xml:id
pb  =>  empty * n
head    =>  text
lg  =>  seq(l) * type
l   =>  text
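
Where Perl is not available, a similar per-element summary can be approximated in a few lines of Python. This is only a rough sketch under stated assumptions: it is not XML::DT's algorithm and does not reproduce the `tup`/`seq` notation above; it just records which child elements and attributes occur for each element name. The function name `summarize` and the small `SAMPLE` fragment are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A small hypothetical fragment modelled on the question's sample.
SAMPLE = (
    '<div2 type="poem"><head>Introduction</head>'
    '<lg type="stanza"><l>Piping down the valleys wild, </l>'
    '<l>Piping songs of pleasant glee, </l></lg></div2>'
)


def local(name):
    # Drop the "{namespace}" part that ElementTree puts in front of names.
    # Note: a prefixed attribute such as xml:id would show up as plain "id".
    return name.rsplit("}", 1)[-1]


def summarize(root):
    # Map each element name to (sorted child names, sorted attribute names),
    # accumulated over every occurrence of that element in the document.
    children = defaultdict(set)
    attrs = defaultdict(set)
    for elem in root.iter():
        name = local(elem.tag)
        children[name].update(local(c.tag) for c in elem)
        attrs[name].update(local(a) for a in elem.attrib)
    return {n: (sorted(children[n]), sorted(attrs[n])) for n in children}


summary = summarize(ET.fromstring(SAMPLE))
for name, (kids, attributes) in summary.items():
    print(name, "=>", kids or "text/empty", "*", attributes)
```

Running the same function over each file and comparing the resulting dictionaries is one way to spot where the documents differ structurally.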
JJoao answered Sep 29 '22 23:09