Why is ASCII.GetBytes returning wrong bytes?

I'm creating a class and converting it into XML.

The problem is that when I convert the XML string into bytes, ASCII.GetBytes returns a byte array with an extra character at the beginning of ascArray.

It's always a ? character, so the XML starts like this:

?<?xml version="1.0" encoding="utf-8"?>

Why is this happening?

This is the code:

  WorkItem p = new WorkItem();

  // Fill the class with whatever needs to be sent to the client
  OneItem posts1 = new OneItem();
  posts1.id = "id 1";
  posts1.username = "hasse";
  posts1.message = "hej again";
  posts1.time = "time1";
  p.setPost(posts1);

  OneItem posts2 = new OneItem();
  posts2.id = "id 2";
  posts2.username = "bella";
  posts2.message = "hej again again";
  posts2.time = "time2";
  p.setPost(posts2);

  // convert the class WorkItem to xml
  MemoryStream memoryStream = new MemoryStream();
  XmlSerializer xs = new XmlSerializer(typeof(WorkItem));
  XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8);
  xs.Serialize(xmlTextWriter, p);

  // send the xml version of WorkItem to client
  byte[] data = memoryStream.ToArray();
  clientStream.Write(data, 0, data.Length);
  Console.WriteLine(" send.." + data);
  clientStream.Close();
Erik asked Dec 29 '25 00:12


1 Answer

I strongly suspect that the data starts with a byte order mark, which can't be represented in ASCII.
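To make that suspicion concrete, here is a minimal sketch of what probably happens. It assumes the bytes were decoded to a string with Encoding.UTF8.GetString before being passed to ASCII.GetBytes. The class and method names (BomDemo, AsciiBytesOf) are made up for illustration.

```csharp
using System;
using System.Text;

public static class BomDemo
{
    // Pushes a string through ASCII; anything outside 0..127 becomes '?' (0x3F).
    public static byte[] AsciiBytesOf(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    public static void Main()
    {
        // Encoding.UTF8 emits a three-byte preamble (the byte order mark).
        byte[] preamble = Encoding.UTF8.GetPreamble();
        Console.WriteLine(BitConverter.ToString(preamble)); // EF-BB-BF

        // GetString does not strip the preamble: it decodes to U+FEFF.
        byte[] utf8 = { 0xEF, 0xBB, 0xBF, (byte)'<' };
        string decoded = Encoding.UTF8.GetString(utf8);
        Console.WriteLine(decoded[0] == '\uFEFF'); // True

        // ASCII cannot represent U+FEFF, so it is replaced with '?'.
        byte[] ascii = AsciiBytesOf(decoded);
        Console.WriteLine((char)ascii[0]); // ?
    }
}
```

If that is what is going on, the leading ? is the substituted byte order mark rather than a stray character in the XML itself.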

It's not clear why you're doing what you're doing in the first place, particularly around the MemoryStream. Why are you creating a UTF-8 encoded byte array, then decoding that to a string (and we don't know what UTF8ByteArrayToString does), then converting it back to a byte array? Why not just write the byte array straight to the client to start with? If you need the data as a string, I'd use a subclass of StringWriter which advertises that it uses UTF-8 as the encoding. If you don't need it as a string, just stick to the byte array.
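The StringWriter suggestion could look roughly like the sketch below. Note is a hypothetical stand-in for WorkItem (whose definition isn't shown), and Utf8StringWriter and ToXml are names chosen here, not something from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's WorkItem class.
public class Note
{
    public string Message { get; set; }
}

// StringWriter reports UTF-16 by default. Overriding Encoding makes the
// XML declaration say encoding="utf-8" instead.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class StringWriterDemo
{
    // Serializes the object to an XML string with no byte order mark in it.
    public static string ToXml(Note note)
    {
        var serializer = new XmlSerializer(typeof(Note));
        using (var writer = new Utf8StringWriter())
        {
            serializer.Serialize(writer, note);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ToXml(new Note { Message = "hej again" }));
    }
}
```

The string produced this way has no byte order mark in it, because no byte-level encoding has happened yet. The bytes only come into existence when you later call Encoding.UTF8.GetBytes on the string.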

Note that even aside from this first character, the fact that you've got an XML document encoded in UTF-8 means there may well be other non-ASCII characters in the string. Why are you using ASCII at all here?
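As a small illustration of that point, the sketch below round-trips a message containing a non-ASCII letter through both encodings. The sample text and the helper name RoundTrip are invented for the example.

```csharp
using System;
using System.Text;

public static class LossyDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        // "\u00E5" is the Swedish letter a-with-ring, which ASCII cannot represent.
        string message = "hej p\u00E5 dig";

        // ASCII silently replaces the letter with '?', so the text is damaged.
        Console.WriteLine(RoundTrip(message, Encoding.ASCII)); // hej p? dig

        // UTF-8 preserves the original string exactly.
        Console.WriteLine(RoundTrip(message, Encoding.UTF8) == message); // True
    }
}
```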

EDIT: Just to be clear, you're fundamentally applying a lossy transformation, and doing it needlessly. Even if you want a local copy of the data, you should have something like this:

// Removed bad try/catch block - don't just catch Exception, and don't
// just swallow exceptions
MemoryStream memoryStream = new MemoryStream();
XmlSerializer xs = new XmlSerializer(typeof(WorkItem));
XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8);
xs.Serialize(xmlTextWriter, p);

// Removed pointless conversion to/from string
// Removed pointless BinaryWriter (just use the stream)

// An alternative would be memoryStream.WriteTo(clientStream);
byte[] data = memoryStream.ToArray();
clientStream.Write(data, 0, data.Length);
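// Log the byte count; concatenating the array itself only prints "System.Byte[]"
Console.WriteLine(" send.." + data.Length + " bytes");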

// Removed Close calls - you should use "using" statements to dispose of
// streams automatically.
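
For reference, the "using" advice might be sketched as follows. This is not the code above. It swaps XmlTextWriter for XmlWriter.Create with XmlWriterSettings, and it adds an optional UTF8Encoding(false) to suppress the byte order mark, in case the receiving side can't handle one. Note, SendDemo, WriteXml and the emitBom parameter are all hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's WorkItem class.
public class Note
{
    public string Message { get; set; }
}

public static class SendDemo
{
    // Serializes 'note' as UTF-8 XML directly onto 'destination'
    // (for example a NetworkStream), with or without a byte order mark.
    public static void WriteXml(Note note, Stream destination, bool emitBom)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(emitBom),
            CloseOutput = false // leave the stream open for the caller
        };
        var serializer = new XmlSerializer(typeof(Note));
        using (XmlWriter writer = XmlWriter.Create(destination, settings))
        {
            serializer.Serialize(writer, note);
        } // disposing the writer flushes it
    }

    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            WriteXml(new Note { Message = "hej again" }, stream, false);
            byte[] data = stream.ToArray();
            Console.WriteLine((char)data[0]); // <
        }
    }
}
```

With emitBom set to true the output starts with the bytes EF BB BF again. That is valid UTF-8 XML, and only becomes a problem when something later forces it through ASCII.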
Jon Skeet answered Dec 31 '25 14:12


