Prevent XmlReader from expanding XML entities

Is there a way to prevent .NET's XmlReader class from expanding XML entities into their value when reading the content?

For instance, suppose the following XML is used as input:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE author PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML" "http://www.oasis-open.org/docbook/xmlcharent/0.3/iso-lat1.ent" >
<author>&aacute;</author>

Let's assume it is not possible to reach the external OASIS DTD needed for the expansion of the aacute entity. I would like the reader to read, in sequence, the author element, then the aacute node of type EntityReference, and finally the author end element, without throwing any errors. How can I achieve this?

UPDATE: I also want to prevent the expansion of character entities such as &#x00E1;.

Gabriel S. asked Oct 30 '22 10:10


1 Answer

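One way to do that is to use `XmlTextReader` and set its EntityHandling property to EntityHandling.ExpandCharEntities. With this setting, general entities such as &aacute; are returned as EntityReference nodes, while character entities such as &#x00E1; are still expanded: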

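using System.Xml;

using (var reader = new XmlTextReader(@"your url"))
{
    // Return general entities as EntityReference nodes instead of expanding them.
    reader.EntityHandling = EntityHandling.ExpandCharEntities;
    while (reader.Read())
    {
        // &aacute; is reported here as an EntityReference node; no exception is thrown.
    }
}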

If that is not an option, you can do the same with XmlReader, but some reflection is required (at least, I am not aware of another way):

using System.Reflection;
using System.Xml;

using (var reader = XmlReader.Create(@"your url", new XmlReaderSettings
{
    DtdProcessing = DtdProcessing.Ignore // or Parse
}))
{
    // Set the internal EntityHandling property, which has the same function as
    // the public one on XmlTextReader. It is a non-public member, so this may
    // stop working if the implementation changes.
    reader.GetType()
          .GetProperty("EntityHandling", BindingFlags.Instance | BindingFlags.NonPublic)
          .SetValue(reader, EntityHandling.ExpandCharEntities);
    while (reader.Read())
    {
        // &aacute; is reported here as an EntityReference node; no exception is thrown.
    }
}

Evk answered Nov 15 '22 04:11
