
Serializing an object to a string: why is my encoding adding stupid characters?

I need to get the serialized XML representation of an object as a string. I'm using XmlSerializer and a MemoryStream to do this.

XmlSerializer serializer = new XmlSerializer(typeof(MyClass));
using (MemoryStream stream = new MemoryStream())
{
  using (XmlTextWriter writer = new XmlTextWriter(stream,Encoding.UTF8))
  {
    serializer.Serialize(writer, myClass);
    string xml = Encoding.UTF8.GetString(stream.ToArray());
    //other chars may be added from the encoding.
    xml = xml.Substring(xml.IndexOf(Convert.ToChar(60)));
    xml = xml.Substring(0, (xml.LastIndexOf(Convert.ToChar(62)) + 1));
    return xml;
  }
}

Now take note of the xml.Substring lines for a moment. What I'm finding is that, even though I'm specifying the encoding on both the XmlTextWriter and the GetString call (and I'm using stream.ToArray(), so I'm operating only on the data actually written to the stream's buffer), the resulting XML string has a non-XML-friendly character added. In my case, it's a '?' at the start of the string. This is why I'm taking the substring between the first '<' and the last '>': to ensure I'm only getting the good stuff.

Strange thing is, looking at this string in the debugger (Text Visualizer), I don't see this '?'. It only appears when I paste what's in the visualizer into Notepad or similar.

So while the above code (substring etc) does the job, what's actually happening here? Is some unsigned byte thing being included and not being represented in the Text Visualizer?

MoSlo asked Dec 09 '22 07:12



1 Answer

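The extra character is the UTF-8 byte order mark (BOM). Encoding.UTF8 is configured to emit it, so the XmlTextWriter writes it at the start of the stream, and GetString then decodes it to the invisible character U+FEFF, which can show up as '?' when pasted somewhere that can't display it. You can exclude the BOM by constructing the encoding explicitly, i.e. instead of Encoding.UTF8, try using: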

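XmlSerializer serializer = new XmlSerializer(typeof(MyClass));
using (MemoryStream stream = new MemoryStream())
{
  // Passing false means this encoding does not emit a BOM.
  var enc = new UTF8Encoding(false);
  using (XmlTextWriter writer = new XmlTextWriter(stream, enc))
  {
    serializer.Serialize(writer, myClass);
  }
  // Disposing the writer also closes the stream, after which stream.Length
  // throws; ToArray() still works on a closed MemoryStream.
  string xml = enc.GetString(stream.ToArray());
  return xml;
}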
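
To see where the stray character comes from, here is a minimal sketch that reproduces it without XmlSerializer. The names BomDemo and DecodeWithBom and the "<root />" payload are made up for illustration; the sketch only assumes that a BOM-emitting writer puts the encoding's preamble at the start of the stream.

```csharp
using System;
using System.Text;

static class BomDemo
{
    // Hypothetical helper: builds the bytes a BOM-emitting writer would leave
    // in the stream (preamble + payload) and decodes them with GetString.
    public static string DecodeWithBom()
    {
        byte[] bom = Encoding.UTF8.GetPreamble();          // EF BB BF
        byte[] payload = Encoding.UTF8.GetBytes("<root />");

        byte[] bytes = new byte[bom.Length + payload.Length];
        bom.CopyTo(bytes, 0);
        payload.CopyTo(bytes, bom.Length);

        // GetString does not strip the BOM; it decodes it to U+FEFF.
        return Encoding.UTF8.GetString(bytes);
    }

    static void Main()
    {
        // Encoding.UTF8 advertises a three-byte preamble.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetPreamble()));

        // UTF8Encoding(false) has an empty preamble, so nothing extra is written.
        Console.WriteLine(new UTF8Encoding(false).GetPreamble().Length);

        string xml = DecodeWithBom();
        Console.WriteLine(xml[0] == '\uFEFF');      // the invisible leading character
        Console.WriteLine(xml.TrimStart('\uFEFF')); // payload without it
    }
}
```

U+FEFF has no visible glyph, which would explain why the Text Visualizer appears to show clean XML while the character is still present in the string.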
Marc Gravell answered Apr 17 '23 12:04
