A message I receive from a server contains tags and in the tags is the data I need.
I try to parse the payload as XML but illegal character exceptions are generated.
I also made use of httpUtility
and Security Utility
to escape the illegal characters, only problem is, it will escape < >
which is needed to parse the XML.
My question is, how do I parse XML when the data contained in it contains illegal non XML characters? (& -> amp;)
_
Thanks.
Example:
<item><code>1234</code><title>voi hoody & polo shirt + Mckenzie jumper</title><description>Good condition size small - medium, text me if interested</description></item>
If you have only &
as invalid character, then you can use regex to replace it with &
. We use regex to prevent replacement of already existing &
, "
, o
, etc. symbols.
Regex can be as follows:
&(?!(?:lt|gt|amp|apos|quot|#\d+|#x[a-f\d]+);)
Sample code:
string content = @"<item><code>1234 & test</code><title>voi hoody & polo shirt + Mckenzie jumper&other stuff</title><description>Good condition size small - medium, text me if interested</description></item>";
content = Regex.Replace(content, @"&(?!(?:lt|gt|amp|apos|quot|#\d+|#x[a-f\d]+);)", "&", RegexOptions.IgnoreCase);
XElement xItem = XElement.Parse(content);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With