I need to be able to reference named HTML entities like &bull; instead of the numeric alternative &#8226; in an XML document. I have control over some parts of the XML document, such as defining the DOCTYPE, but doing a find-and-replace in the actual XML is not an option. I can get some entities like &nbsp; and &amp; by including the XHTML transitional DOCTYPE, but I need to define more manually. How do I do this?
-- EDIT --
Thanks to Jim's answer, here's what I ended up with. This is great because I can utilize the XHTML transitional entities, and also add my own:
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[
  <!ENTITY bull "&#8226;">
  <!ENTITY ldquo "&#8220;">
  <!ENTITY rdquo "&#8221;">
  ... etc ...
]>
In XML, a character reference wraps a numeric code point in &# and ; while an entity reference wraps a name in & and ;. For example, &#169; is a decimal character reference and &copy; is an entity reference.
Entities are placeholders in XML. They can be declared in the document prolog (the internal DTD subset) or in an external DTD. There are several types of entities; the one that matters here is the character entity. Both HTML and XML reserve some symbols for markup, such as < and &, so these cannot appear literally in content and have to be escaped.
What are XML entities? XML entities are a way of representing an item of data within an XML document, instead of using the data itself. Various entities are built into the specification of the XML language. For example, the entities &lt; and &gt; represent the characters < and >.
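As a rough illustration of the difference between numeric character references, the predefined entities, and an undeclared named entity, here is a minimal sketch using Python's standard xml.etree.ElementTree. The sample strings and variable names are made up for illustration; other parsers may report the undeclared entity differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a decimal and a hex character reference,
# plus three of XML's predefined entities (lt, gt, amp).
sample = "<p>&#169; &#xA9; &lt;tag&gt; &amp;</p>"
text = ET.fromstring(sample).text
print(text)

# A named entity that is neither predefined nor declared in a DTD
# is rejected by this parser.
try:
    ET.fromstring("<p>&copy;</p>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```

Numeric references and the predefined entities work without any DTD, which is why only names such as &bull; or &copy; need a declaration.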
If you can modify the XML to include an inline DTD, you can define the entities there:
<!DOCTYPE yourRootElement [
  <!ENTITY bull "&#8226;">
  ....
]>
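To check that entities declared in an inline DTD are actually expanded, something like the following sketch may help. It assumes Python's standard xml.etree.ElementTree as the parser; the root element name note and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares three named
# entities in terms of numeric character references.
doc = """<!DOCTYPE note [
  <!ENTITY bull "&#8226;">
  <!ENTITY ldquo "&#8220;">
  <!ENTITY rdquo "&#8221;">
]>
<note>&bull; &ldquo;quoted&rdquo;</note>"""

# The parser replaces each declared entity with its character.
note_text = ET.fromstring(doc).text
print(note_text)
```

Note that this parser does not fetch external DTDs, so entities that come only from the XHTML transitional DTD would still need an internal declaration like the ones above, or a parser configured to load the external subset.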