I have an XML document that, after being sent through my XSLT, no longer has line breaks before the XML attributes. For example,
<myoutertag one="a"
two="b"
three="c">
<myinnertag four="d"
five="e"/>
</myoutertag>
would become
<myoutertag one="a" two="b" three="c">
<myinnertag four="d" five="e"/>
</myoutertag>
This is of course perfectly valid XML but it's more difficult to read, especially if there are many long attribute values. From what I've read, XSLT is not able to preserve these line breaks as the XSLT processor is not passed such unimportant information.
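To illustrate why that layout is lost, here is a minimal Python sketch (not part of my actual XSLT setup; the helper name attributes_by_element is made up for this example). It parses both versions of the sample with the standard library and compares what the parser hands on. Whitespace between attributes never reaches the application, so both inputs look identical after parsing.

```python
import xml.etree.ElementTree as ET

# The same document, once with line breaks between attributes and once without.
multi_line = '''<myoutertag one="a"
            two="b"
            three="c">
    <myinnertag four="d"
                five="e"/>
</myoutertag>'''

single_line = '''<myoutertag one="a" two="b" three="c">
    <myinnertag four="d" five="e"/>
</myoutertag>'''


def attributes_by_element(xml_text):
    """Return (tag, attributes) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, dict(el.attrib)) for el in root.iter()]


# The parser reports the same tags and attributes for both layouts.
print(attributes_by_element(multi_line) == attributes_by_element(single_line))
```

Any tool that works on the parsed tree, as an XSLT processor does, therefore has no way to reproduce the original attribute layout; it can only apply a layout of its own when serializing.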
So, what I'm looking for now is a command line based pretty printer (usable in Linux) which ideally would only change the document in that it adds line breaks between the attributes. Whether it adds one before the first attribute or not is pretty much irrelevant to me, just as long as it's more easily readable.
I'm using the input file
<?xml version="1.0" encoding="UTF-8"?>
<myoutertag one="a" two="b" three="c">
    <myinnertag four="d" five="e"/>
</myoutertag>
I tried both xmllint --format test.xml and cat test.xml | xmllint --format - with the same result:
<?xml version="1.0" encoding="UTF-8"?>
<myoutertag one="a" two="b" three="c">
  <myinnertag four="d" five="e"/>
</myoutertag>
So, among other changes, the indentation before <myinnertag> was reduced from four spaces to two spaces. I want none of those changes. This is using libxml version 20706.
I tried the styles none, nsgmls, nice, indented, record and record_c. The only one that comes close is nsgmls, which will add line breaks, but the result looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<myoutertag
one="a"
two="b"
three="c"
><myinnertag
four="d"
five="e"
/></myoutertag>
So, no indentation and weird line breaking.
The output of xmlstarlet fo test.xml is the same as with xmllint. I also tried something like xmlstarlet -ed -P --insert "//@*" -t text -n "" -v "\\n" test.xml but that resulted in a glibc pointer error. Not surprising, I guess, as I'm trying to add text in between attributes.
This is the closest I've gotten so far. Running the command tidy -quiet -xml -indent -wrap 1 test.xml gives me:
<?xml version="1.0"
encoding="UTF-8"?>
<myoutertag one="a"
two="b"
three="c">
<myinnertag four="d"
five="e"/>
</myoutertag>
So, if I could get it to indent the attributes on the new lines some more, that would basically solve my problem (I think).
Any further suggestions?
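One possible direction, if no existing tool fits, is a small script. The following is a minimal Python sketch, not a finished tool: it re-serializes a parsed document with one attribute per line, aligned under the first attribute. The names format_element, pretty and INDENT are invented for this example, and four spaces per nesting level is an assumption. It does not reproduce the XML declaration, comments or processing instructions, it reflows whitespace in text content, and namespaced names would come out in ElementTree's {uri}name form, so it only suits simple documents like the test file.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

INDENT = "    "  # assumption: four spaces per nesting level


def format_element(el, level=0):
    """Return el as a list of lines with one attribute per line."""
    pad = INDENT * level
    children = list(el)
    text = (el.text or "").strip()

    # Opening tag: the first attribute stays on the tag line,
    # the remaining ones are aligned underneath it.
    head = f"{pad}<{el.tag}"
    parts = [f"{name}={quoteattr(value)}" for name, value in el.attrib.items()]
    if parts:
        align = " " * (len(head) + 1)
        lines = [f"{head} {parts[0]}"] + [align + p for p in parts[1:]]
    else:
        lines = [head]

    # Empty element: close it in place.
    if not children and not text:
        lines[-1] += "/>"
        return lines

    lines[-1] += ">"
    # Text-only element: keep text and closing tag on the same line.
    if text and not children:
        lines[-1] += escape(text) + f"</{el.tag}>"
        return lines

    inner_pad = INDENT * (level + 1)
    if text:
        lines.append(inner_pad + escape(text))
    for child in children:
        lines.extend(format_element(child, level + 1))
        tail = (child.tail or "").strip()
        if tail:
            lines.append(inner_pad + escape(tail))
    lines.append(f"{pad}</{el.tag}>")
    return lines


def pretty(xml_text):
    """Parse xml_text and return it with line breaks between attributes."""
    return "\n".join(format_element(ET.fromstring(xml_text))) + "\n"


sample = ('<myoutertag one="a" two="b" three="c">'
          '<myinnertag four="d" five="e"/></myoutertag>')
print(pretty(sample), end="")
```

To use it as a command line filter, the sample at the bottom could be replaced by reading the document from standard input and passing it to pretty.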
OK, I've found a solution. The tool I used is called HTML Tidy (well, actually I used jTidy, a port of HTML Tidy to Java, which is therefore portable). The tool offers many configuration options; the one I was looking for is indent-attributes: true. In fact, my whole configuration file is:
add-xml-decl: true
drop-empty-paras: false
fix-backslash: false
fix-bad-comments: false
fix-uri: false
input-xml: true
join-styles: false
literal-attributes: true
lower-literals: false
output-xml: true
preserve-entities: true
quote-ampersand: false
quote-marks: false
quote-nbsp: false

indent: auto
indent-attributes: true
indent-spaces: 4
tab-size: 4
vertical-space: true
wrap: 150

char-encoding: utf8
input-encoding: utf8
newline: CRLF
output-encoding: utf8
quiet: true
The meanings of those options are explained in the Tidy manual (or the man page if you install it on a Linux system). I mostly cared about the middle block, from indent to wrap, which holds the indentation settings.
I can now call the tool using the command java -jar jtidy-r938.jar -config tidy.config test.xml and the output will be
<?xml
    version="1.0"
    encoding="UTF-8"?>
<myoutertag
    one="a"
    two="b"
    three="c">
    <myinnertag
        four="d"
        five="e" />
</myoutertag>
Now I'm happy. :-)