Serialization of unprintable character

The following code:

var c = (char) 1;

var serializer = new XmlSerializer(typeof(string));

var writer = new StringWriter();
serializer.Serialize(writer, c.ToString());
var serialized = writer.ToString();

// The exception is thrown here, during deserialization.
var dc = serializer.Deserialize(new StringReader(serialized));

throws this exception in .NET 4:

InvalidOperationException: There is an error in XML document (2, 12). '', hexadecimal value 0x01, is an invalid character. Line 2, position 12.

Am I doing something wrong? Or is there a reasonable workaround?

Many thanks!

CityView asked May 16 '11 16:05


2 Answers

There is a workaround: set XmlReaderSettings.CheckCharacters to false so that the reader skips character validation:

XmlReader xr = XmlReader.Create(new StringReader(serialized),
    new XmlReaderSettings { CheckCharacters = false });
var dc = (string)serializer.Deserialize(xr);
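
For completeness, here is a rough sketch of the whole round trip with that setting applied, so the snippet above can be tried in isolation. The class and method names (RoundTripDemo, RoundTrip) are made up for illustration and are not part of the original question or answer; only XmlSerializer, XmlReader.Create and XmlReaderSettings.CheckCharacters come from the discussion above.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical helper, named here only for illustration.
static class RoundTripDemo
{
    // Serializes a string with XmlSerializer, then reads it back through
    // an XmlReader that has character checking switched off.
    public static string RoundTrip(string value)
    {
        var serializer = new XmlSerializer(typeof(string));

        string serialized;
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            serialized = writer.ToString();
        }

        // CheckCharacters = false tells the reader not to reject
        // characters that fall outside the legal XML character range.
        var settings = new XmlReaderSettings { CheckCharacters = false };
        using (var reader = XmlReader.Create(new StringReader(serialized), settings))
        {
            return (string)serializer.Deserialize(reader);
        }
    }
}
```

Calling RoundTripDemo.RoundTrip(((char) 1).ToString()) should then return the original one-character string instead of throwing. Bear in mind that the serialized text is still not well-formed XML 1.0, so other parsers that do check characters may reject it.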

mellamokb answered Oct 21 '22 14:10


You're trying to serialize characters which can't be represented within XML. Unfortunately they break XML serialization. I don't know of any workarounds for this other than writing your own escaping code.

On the other hand, actual uses for such characters (ASCII characters before space, other than tab, carriage return and line feed IIRC) are relatively rare - you may find you're okay just to strip them. Alternatives are to come up with your own escaping, or encode the whole string as binary and base64 the result. Escaping will take a good deal less space than the re-encoding approach :)
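
As a rough illustration of two of those options, the sketch below strips the offending control characters or, alternatively, base64-encodes the whole string before it goes into the serializer. The class XmlTextSanitizer and its method names are hypothetical, and the stripping only covers the ASCII control characters mentioned above; it does not deal with other characters XML 1.0 rejects, such as unpaired surrogates. UTF-8 is an assumed choice of encoding for the base64 variant.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper, named here only for illustration.
static class XmlTextSanitizer
{
    // True for ASCII control characters below space, except tab,
    // line feed and carriage return, which XML 1.0 does allow.
    static bool IsDisallowedControl(char ch)
    {
        return ch < ' ' && ch != '\t' && ch != '\n' && ch != '\r';
    }

    // Option 1: drop the characters that XML cannot carry.
    public static string Strip(string s)
    {
        return new string(s.Where(ch => !IsDisallowedControl(ch)).ToArray());
    }

    // Option 2: encode the whole string as UTF-8 bytes, then base64.
    // The result contains only XML-safe characters but is longer.
    public static string ToBase64(string s)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(s));
    }

    // Reverses ToBase64 after deserialization.
    public static string FromBase64(string s)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(s));
    }
}
```

Stripping loses information, so it only suits data where the control characters carry no meaning. The base64 route preserves everything, at the cost of roughly a third more space and a payload that is no longer human-readable.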

Jon Skeet answered Oct 21 '22 14:10