I have a string containing some XML. For example:
<foo>
<bar>this is < than this</bar>
</foo>
and I need to remove the illegal characters from it before I load it into an XmlDocument.
Any thoughts?
Thanks in advance.
"I have a string containing some XML."
No, you don't. You have some XML-like text that is not well-formed. Once it's all glued together like that, it's hard work finding the special characters. Oh, you could try to look for "< " or " >", but those could appear anyway. My advice is to go back a step and look at where that string came from. Change that code so it escapes special characters (for example < as &lt; and & as &amp;) before the text is put between the tags.
In the absence of any other options, I would probably ignore XML tools for the moment (because they'll throw up when you try to give them the string) and keep some sort of running count of open/close (odd/even for quotes) on special characters. Once you've encountered a <, you aren't allowed another one until you meet a >, for example. Unfortunately a raw < isn't allowed in attribute values either, so I don't know what you'll do with <foo p1="a<a">, but at least you could fix <foo>a<A</foo>. (Assuming they would never put a < in a tag name, meeting the second one means you need to back up and escape the first one.) Likewise, once you've encountered a >, you can't have another one until the next <. And so on. My sympathies.
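For what it's worth, here is a rough sketch of that scan. It is in Python only to keep it short (the same loop should port to C# without much trouble), and the names TAG and escape_stray_angle_brackets are made up for illustration. It assumes a real tag starts with a letter or underscore right after the < (or </) and never contains another < before its closing >. Anything else gets escaped. It does not handle a < inside an attribute value, comments, CDATA, processing instructions, or stray & characters, so treat it as a last resort rather than a real fix.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a plausible tag is "<" or "</", then a name start character,
# then anything except "<" or ">" up to the closing ">".
TAG = re.compile(r"</?[A-Za-z_][^<>]*>")


def escape_stray_angle_brackets(text: str) -> str:
    """Escape "<" and ">" that do not belong to something that looks like a tag."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "<":
            m = TAG.match(text, i)
            if m:
                # Looks like a tag: copy it through untouched and skip past it.
                out.append(m.group(0))
                i = m.end()
                continue
            # Another "<" (or text) showed up before a ">": escape this one.
            out.append("&lt;")
        elif ch == ">":
            # A ">" outside any tag is legal in XML text, but escaping is harmless.
            out.append("&gt;")
        else:
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    broken = "<foo>\n<bar>this is < than this</bar>\n</foo>"
    fixed = escape_stray_angle_brackets(broken)
    print(fixed)
    # The repaired string can now be handed to a real XML parser.
    print(ET.fromstring(fixed).find("bar").text)
```

After this pass the string from the question loads in a parser, and <foo>a<A</foo> comes out as <foo>a&lt;A</foo>. Text that happens to look like a tag, such as "a <b and c> d", will still be left alone and break the parse, which is why fixing the producer is the better route.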