
Issue about 65533 � in C# text file reading

Tags:

c#

unicode

I created a sample app to load special characters that I copied and pasted from OpenOffice Writer into Notepad. The double quotes differ between the two, and the problem appears when I try to load the file like this:

var lines = File.ReadAllLines("..\\ter34.txt");

Reading the file this way produces character 65533. The text file contains:

This has been changed to the symbol:

Aravind Srinivas asked Feb 22 '13 10:02


1 Answer

U+FFFD is the "Unicode replacement character", which is used if the data you try to read is invalid for the encoding which is being used to convert binary data to text.

For example, if you write a file out using ISO-8859-1, but then try to read it using UTF-8, then you could easily end up with some byte sequences which simply aren't valid UTF-8. Each invalid byte would be translated (by default) into U+FFFD.
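
To make the mismatch concrete, here is a minimal sketch that encodes a string as ISO-8859-1 and then decodes the same bytes as UTF-8. The sample string and the class name are made up for illustration; only the `Encoding` calls come from the framework.

```csharp
using System;
using System.Text;

class ReplacementCharDemo
{
    static void Main()
    {
        // Hypothetical sample text: in ISO-8859-1, 'é' (U+00E9) is the single byte 0xE9.
        byte[] latin1Bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes("caf\u00E9");

        // In UTF-8, 0xE9 is a lead byte that must be followed by continuation bytes.
        // None follow here, so the default decoder substitutes U+FFFD.
        string decoded = Encoding.UTF8.GetString(latin1Bytes);

        Console.WriteLine((int)decoded[decoded.Length - 1]); // prints 65533
    }
}
```

The number printed is the same 65533 from the question: it is simply U+FFFD written in decimal.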

Basically, you need to provide the right encoding to File.ReadAllLines, as a second argument. That means you need to know the encoding of the file first, of course.
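
As a sketch of what that looks like, the call below passes an explicit encoding to `File.ReadAllLines`. The choice of Windows-1252 is only an assumption (it is what Notepad typically writes when saving as "ANSI" on a Western European system); the actual encoding of the file has to be checked.

```csharp
using System;
using System.IO;
using System.Text;

class ReadWithEncoding
{
    static void Main()
    {
        // Assumption: the file was saved as Windows-1252 ("ANSI" in Notepad).
        // On .NET Core / .NET 5+, code page 1252 is only available after calling
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
        Encoding encoding = Encoding.GetEncoding(1252);

        // Same path as in the question, now with the encoding as the second argument.
        string[] lines = File.ReadAllLines("..\\ter34.txt", encoding);

        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}
```

If the file turns out to be UTF-16 or UTF-8 instead, `Encoding.Unicode` or `Encoding.UTF8` would go in the same place.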

Jon Skeet answered Sep 20 '22 04:09