
Sending a string containing special characters through a TcpClient (byte[])

I'm trying to send a string containing special characters through a TcpClient (byte[]). Here's an example:

  • Client enters "amé" in a textbox
  • Client converts string to byte[] using a certain encoding (I've tried all the predefined ones plus some like "iso-8859-1")
  • Client sends byte[] through TCP
  • Server receives and outputs the string reconverted with the same encoding (to a listbox)

Edit :

I forgot to mention that the resulting string was "am?".

Edit-2 (as requested, here's some code):

@DJKRAZE here's a bit of code. On the client side:

byte[] buffer = Encoding.ASCII.GetBytes("amé");
server.Client.Send(buffer); // server is a TcpClient

On the server side:

byte[] buffer = new byte[1024];
int received = Client.Receive(buffer); // number of bytes actually read
string message = Encoding.ASCII.GetString(buffer, 0, received);
ListBox1.Items.Add(message);

The string that appears in the listbox is "am?"

=== Solution ===

Encoding encoding = Encoding.GetEncoding("iso-8859-1");
byte[] message = encoding.GetBytes("babé");

Update:

Simply using Encoding.UTF8.GetBytes("ééé"); works like a charm.

Philippe Paré asked Feb 26 '13 05:02


2 Answers

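It's never too late to answer a question, I think. I hope someone finds this useful.

A C# char is a 16-bit UTF-16 code unit, while ASCII only defines 7-bit code points (0 to 127). Encoding.ASCII cannot represent é, so it substitutes a question mark for it rather than truncating anything. After some research, I found UTF-8 to be the best encoding for special characters, since it can represent every Unicode character.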

// data to send via TCP or any stream/file
byte[] message_to_send = Encoding.UTF8.GetBytes("amé");

// when receiving, pass the received bytes to GetString to get the string back
// (here the same array stands in for the bytes read on the other side)
string received_string = Encoding.UTF8.GetString(message_to_send);
Philippe Paré answered Nov 12 '22 23:11


Your problem appears to be the Encoding.ASCII.GetBytes("amé") and Encoding.ASCII.GetString(buffer) calls, as the user '500 - Internal Server Error' hinted in the comments.

The é character (U+00E9) lies outside the 7-bit ASCII range. In UTF-8 it is encoded as the two-byte sequence C3 A9. When you use the Encoding.ASCII class to encode or decode, é is converted to a question mark because it has no ASCII encoding. The same happens to any character that ASCII cannot represent.
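
To make the difference concrete, here is a minimal sketch that prints the bytes each encoding produces for "amé". The class and method names (EncodingComparison, HexBytes) are made up for illustration, and the string is written with a \u00E9 escape so the result does not depend on how the source file is saved.

```csharp
using System;
using System.Text;

public static class EncodingComparison
{
    // Encodes `text` with the given encoding and returns the bytes as
    // hyphen-separated hex, which is the format BitConverter.ToString uses.
    public static string HexBytes(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        const string text = "am\u00E9"; // "amé"

        // ASCII has no code for é, so the encoder substitutes '?' (0x3F).
        Console.WriteLine(HexBytes(Encoding.ASCII, text));                      // 61-6D-3F

        // ISO-8859-1 maps é to the single byte E9.
        Console.WriteLine(HexBytes(Encoding.GetEncoding("iso-8859-1"), text));  // 61-6D-E9

        // UTF-8 uses the two-byte sequence C3 A9.
        Console.WriteLine(HexBytes(Encoding.UTF8, text));                       // 61-6D-C3-A9

        // Once the '?' is in the byte array, the original character is gone for good.
        Console.WriteLine(Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(text))); // am?
    }
}
```

ISO-8859-1 happens to work for é, but it only covers 256 characters, which is one reason to prefer UTF-8 when the input is arbitrary text.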

Change your code to use Encoding.UTF8.GetBytes() and Encoding.UTF8.GetString() and it should work for you.
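
For what it's worth, here is a rough sketch of a full UTF-8 round trip over a loopback TCP connection. It is not the asker's actual client/server code: Utf8TcpDemo and RoundTrip are hypothetical names, and both ends run in one process just to keep the example self-contained. It assumes the sender closes its side after one message, so the receiver can read until the stream ends and decode only the bytes that actually arrived.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class Utf8TcpDemo
{
    // Sends `text` as UTF-8 over a loopback TCP connection and returns
    // the string decoded on the receiving side.
    public static string RoundTrip(string text)
    {
        // Port 0 asks the OS for any free port (an assumption made for this demo).
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var sender = new TcpClient())
            {
                sender.Connect(IPAddress.Loopback, port);
                using (var receiver = listener.AcceptTcpClient())
                {
                    // Sender: encode, write, then signal that no more data follows.
                    byte[] payload = Encoding.UTF8.GetBytes(text);
                    sender.GetStream().Write(payload, 0, payload.Length);
                    sender.Client.Shutdown(SocketShutdown.Send);

                    // Receiver: collect bytes until the stream ends, then decode
                    // once, so a multi-byte character split across two reads
                    // is not decoded in halves.
                    using (var received = new MemoryStream())
                    {
                        NetworkStream stream = receiver.GetStream();
                        byte[] buffer = new byte[1024];
                        int count;
                        while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            received.Write(buffer, 0, count);
                        }
                        return Encoding.UTF8.GetString(received.ToArray());
                    }
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("am\u00E9")); // amé
    }
}
```

On a connection that stays open for several messages you would need some framing instead, for example a length prefix, because a single read is not guaranteed to return exactly one message.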

Corey answered Nov 13 '22 00:11