Parse malformed XML

Question

I'm trying to load a piece of (possibly) malformed HTML into an XmlDocument object, but it fails with an XmlException because there are extra opening/closing tags and malformed tags such as <img > instead of <img />.

How do I get the XML to parse with all the errors in the data? Is there any XML validator that I can apply before parsing, to correct these errors? Or would handling the exception parse whatever can be parsed?

Marc Gravell · Accepted Answer

The HTML Agility Pack parses HTML rather than XHTML and is quite forgiving of malformed markup. Its object model will be familiar if you've used XmlDocument.
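
As a rough illustration of the above, here is a minimal sketch that assumes the HtmlAgilityPack NuGet package is referenced. The input string and the //img query are made up for the example and do not come from the question. It loads malformed markup without throwing, lists the problems the parser recorded, and then queries the resulting tree.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

class Program
{
    static void Main()
    {
        // Hypothetical malformed input: an <img> that is never closed,
        // a stray </b>, and a <p> without a closing tag.
        string html = "<div><img src=\"a.png\" ><p>Hello</b> world</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // malformed markup is tolerated rather than rejected

        // Parse problems are collected on the document instead of being thrown.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine($"Line {error.Line}: {error.Reason}");
        }

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection images = doc.DocumentNode.SelectNodes("//img");
        if (images != null)
        {
            foreach (HtmlNode img in images)
            {
                Console.WriteLine(img.GetAttributeValue("src", ""));
            }
        }
    }
}
```

Note that HtmlDocument and HtmlNode are the library's own types, not subclasses of XmlDocument and XmlNode, so code written against System.Xml needs some adapting even though the navigation style is similar.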

annakata · Answer

You might want to check out the answer to this question.

Basically, somewhere between a .NET port of Beautiful Soup and the HTML Agility Pack there is a way.

Tags: c#, parsing, xml, xml-parsing, xmldocument

Robin Rodricks

2 Answers

Marc Gravell

annakata
