
Preserving attribute whitespace

Disclaimer: the following is a sin against XML. That's why I'm trying to change it with XSLT :)

My XML currently looks like this:

<root>
    <object name="blarg" property1="shablarg" property2="werg".../>
    <object name="yetanotherobject" .../>
</root>

Yes, I'm putting all the textual data in attributes. I'm hoping XSLT can save me; I want to move toward something like this:

<root>
    <object>
        <name>blarg</name>
        <property1>shablarg</property1>
        ...
    </object>
    <object>
        ...
    </object>
</root>

I've actually got all of this working so far, with the exception that my sins against XML have been more... exceptional. Some of the tags look like this:

<object description = "This is the first line

This is the third line.  That second line full of whitespace is meaningful"/>

I'm using xsltproc under Linux, but it doesn't seem to have any option to preserve whitespace. I've tried xsl:preserve-space and xml:space="preserve" to no avail. Every option I've found applies to whitespace within element content, not within attribute values. Every single time, the above gets changed to:

This is the first line This is the third line.  That second line full of whitespace is meaningful

So the question is, can I preserve the attribute whitespace?

Atiaxi asked Nov 04 '08 00:11



3 Answers

This is actually a raw XML parsing problem, not something XSLT can help you with. An XML parser must convert the newlines in that attribute value to spaces, as per ‘3.3.3 Attribute-Value Normalization’ in the XML standard. So anything currently reading your description attributes and keeping the newlines in is doing it wrong.
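To see the normalization in action, here is a minimal sketch. It uses Python's xml.etree.ElementTree as a stand-in parser (an assumption for illustration; it is not xsltproc, though any conforming parser should treat the attribute the same way). The sample strings are made up.

```python
import xml.etree.ElementTree as ET

# A literal blank line inside the attribute value, as in the question.
literal = ET.fromstring('<object description="first line\n\nthird line"/>')

# The same value with the line breaks written as character references.
escaped = ET.fromstring('<object description="first line&#10;&#10;third line"/>')

literal_value = literal.get("description")
escaped_value = escaped.get("description")

# The literal newlines are gone by the time the application sees the value;
# the character references come through as real newline characters.
print(repr(literal_value))
print(repr(escaped_value))
```

The parser hands over the already-normalized value, which is why no stylesheet setting can bring the line breaks back.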

You may be able to recover the newlines by pre-processing the XML to escape the newlines to &#10; character references, as long as you haven't also got newlines where charrefs are disallowed, such as inside tag bodies. Charrefs should survive as control characters through to the attribute value, where you can then turn them into text nodes.
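A rough sketch of that pre-processing step in Python follows. The function name escape_attr_newlines is hypothetical, and the scanner is deliberately naive: it assumes the file contains no comments, CDATA sections, processing instructions, or DOCTYPE declarations with quotes or angle brackets in them. It rewrites only line breaks that sit inside a quoted attribute value, leaving newlines between attributes and in element content alone.

```python
import xml.etree.ElementTree as ET


def escape_attr_newlines(xml_text: str) -> str:
    """Replace literal line breaks inside quoted attribute values with &#10;."""
    # Normalize line endings first, as an XML parser would (CRLF and CR -> LF).
    xml_text = xml_text.replace("\r\n", "\n").replace("\r", "\n")
    out = []
    in_tag = False  # True between '<' and the matching '>'
    quote = None    # quote character of the attribute value being read, if any
    for ch in xml_text:
        if quote is not None:
            if ch == quote:
                quote = None      # end of the attribute value
            elif ch == "\n":
                ch = "&#10;"      # keep the line break as a character reference
        elif in_tag:
            if ch in "\"'":
                quote = ch        # start of an attribute value
            elif ch == ">":
                in_tag = False
        elif ch == "<":
            in_tag = True
        out.append(ch)
    return "".join(out)


# Made-up sample input modelled on the question.
raw = '<root>\n  <object name="blarg"\n    description="first line\n\nthird line"/>\n</root>'
fixed = escape_attr_newlines(raw)
description = ET.fromstring(fixed).find("object").get("description")
print(repr(description))
```

Run this over the file before handing it to xsltproc; the stylesheet then sees real newline characters in the attribute value and can copy them into text nodes.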

bobince answered Oct 24 '22 05:10


According to the Annotated XML Spec, white space in attribute values is normalized by the XML processor (see the (T) annotation on 3.3.3). So, it looks like the answer is probably no.

James Sulak answered Oct 24 '22 06:10


As others have pointed out, the XML spec doesn't allow whitespace such as line breaks to be preserved in attribute values. In fact, this is one of the few differentiators between what you can do with attributes and elements (the other main one being that elements can contain other tags while attributes cannot).

You will have to process the file outside of XML first in order to preserve the line breaks.

Ned Batchelder answered Oct 24 '22 07:10