
How to handle the HTML entity nbsp in XSLT without changing the input file

Tags: html, xml, xslt

I am trying to convert an HTML file into an XML file using XSLT (using Oxygen 9.0 for the transformation).

When I configure and run the XSLT transformation with the HTML file, Oxygen reports:

The entity 'nbsp' was referenced, but not declared.

My input html file is:

<div><span>&nbsp;some text</span></div>

Note: I want to know how to handle that entity using only the XSLT; I don't want to make any changes to the input file.

Ramesh asked Dec 10 '22 02:12



1 Answer

You could use XML entities: create a wrapper XML file that declares the nbsp entity and pulls in the (not well-formed) fragment as an external entity.

For example, assume that your fragment is saved as a file called "invalid.xml":

<div><span>&nbsp;some text</span></div>

Create an XML file like this:

<!DOCTYPE wrapper [
   <!ENTITY nbsp "&#160;">
   <!ENTITY invalid-xml-document SYSTEM "./invalid.xml">
]><wrapper>
&invalid-xml-document;</wrapper>

When that file is parsed, the parser declares the nbsp entity, includes the content of "invalid.xml", and resolves the nbsp reference properly. The result is this:

<wrapper>
  <div>
    <span> some text</span> 
  </div>
</wrapper>

Then, just adjust your XSLT to accommodate the new document element (in this example, the element <wrapper>).
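
As a rough illustration of that adjustment, here is a minimal XSLT 1.0 sketch. It assumes the fragment has a single top-level element (the <div> above) and that you want the synthetic <wrapper> dropped from the output. The identity template is only a placeholder for your own HTML-to-XML templates, which the question does not show.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Match the synthetic document element and process only its element
       children, so <wrapper> itself never reaches the output.
       Assumption: the included fragment has exactly one top-level element. -->
  <xsl:template match="/wrapper">
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!-- Placeholder identity template: copies everything unchanged.
       Replace or override it with your real conversion templates. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run the transformation against the wrapper file rather than against "invalid.xml" directly. If your existing templates use absolute paths such as /div, they would need to become /wrapper/div (or be written as relative matches like div).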

Mads Hansen answered May 01 '23 20:05
