
Writing XML files using XmlTextWriter with ISO-8859-1 encoding

I'm having a problem writing Norwegian characters into an XML file using C#. I have a string variable containing some Norwegian text (with letters like æøå).

I'm writing the XML using an XmlTextWriter, writing the contents to a MemoryStream like this:

MemoryStream stream = new MemoryStream();
XmlTextWriter xmlTextWriter = new XmlTextWriter(stream, Encoding.GetEncoding("ISO-8859-1"));
xmlTextWriter.Formatting = Formatting.Indented;
xmlTextWriter.WriteStartDocument(); //Start doc

Then I add my Norwegian text like this:

xmlTextWriter.WriteCData(myNorwegianText);

Then I write the file to disk like this:

FileStream myFile = new FileStream(myPath, FileMode.Create);
StreamWriter sw = new StreamWriter(myFile);

stream.Position = 0;
StreamReader sr = new StreamReader(stream);
string content = sr.ReadToEnd();

sw.Write(content);
sw.Flush();

myFile.Flush();
myFile.Close();

Now the problem is that in the file on disk, all the Norwegian characters are garbled.

I'm probably doing the above in some stupid way. Any suggestions on how to fix it?

henningst asked Sep 26 '08 12:09



3 Answers

Why are you writing the XML first to a MemoryStream and then writing that to the actual file stream? That's pretty inefficient. If you write directly to the FileStream it should work.
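As a rough sketch of the direct approach, the XmlTextWriter can be handed the FileStream itself, so the ISO-8859-1 bytes reach the file without ever being decoded again. The class name DirectXmlWrite, the method Save and the element name "text" are made up for illustration; the question does not show the document's real structure.

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class DirectXmlWrite
{
    // Writes a small document straight to disk in ISO-8859-1.
    // The element name "text" is hypothetical; the question shows no element names.
    public static void Save(string path, string norwegianText)
    {
        using (FileStream file = new FileStream(path, FileMode.Create))
        using (XmlTextWriter writer = new XmlTextWriter(file, Encoding.GetEncoding("ISO-8859-1")))
        {
            writer.Formatting = Formatting.Indented;
            writer.WriteStartDocument();
            writer.WriteStartElement("text");
            writer.WriteCData(norwegianText);
            writer.WriteEndElement();
            writer.WriteEndDocument();
        } // Disposing the writer flushes it and closes the file.
    }
}
```

Because nothing reads the bytes back as text, there is no second encoding to get wrong.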

If you still want to do the double write, for whatever reason, do one of two things. Either

  1. Make sure that the StreamReader and StreamWriter objects you use all use the same encoding as the one you used with the XmlWriter (not just the StreamWriter, like someone else suggested), or

  2. Don't use StreamReader/StreamWriter. Instead just copy the stream at the byte level using a simple byte[] and Stream.Read/Write. This is going to be, btw, a lot more efficient anyway.
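Option 2 could look roughly like the sketch below. The helper name CopyBytes and the 4096-byte buffer size are arbitrary choices for illustration, not something from the question.

```csharp
using System.IO;

static class StreamCopy
{
    // Copies every remaining byte from source to destination without decoding,
    // so the encoding chosen by the XmlTextWriter is left untouched.
    public static void CopyBytes(Stream source, Stream destination)
    {
        byte[] buffer = new byte[4096];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
        }
    }
}
```

With the question's variables this would be used by calling xmlTextWriter.Flush(), setting stream.Position = 0, and then calling StreamCopy.CopyBytes(stream, myFile). The Flush matters because the writer buffers its output before it reaches the MemoryStream.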

tomasr answered Nov 18 '22 07:11



Both your StreamWriter and your StreamReader are using UTF-8, because you're not specifying the encoding. That's why things are getting corrupted.
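To see the mismatch in isolation, here is a small sketch that does by hand what the default StreamReader does: it takes ISO-8859-1 bytes and decodes them as UTF-8. The class and method names are invented for the demonstration.

```csharp
using System.Text;

static class EncodingMismatchDemo
{
    // Encodes text as ISO-8859-1, then decodes those bytes as UTF-8,
    // which is what a StreamReader without an explicit encoding does.
    public static string RoundTripThroughUtf8(string text)
    {
        byte[] latin1Bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(text);
        return Encoding.UTF8.GetString(latin1Bytes);
    }
}
```

In ISO-8859-1 each of æ, ø and å is a single byte above 0x7F. On its own such a byte is not a valid UTF-8 sequence, so the decoder substitutes the replacement character U+FFFD and the original letters are lost before the StreamWriter ever sees them.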

As tomasr said, using a FileStream to start with would be simpler - but also MemoryStream has the handy "WriteTo" method which lets you copy it to a FileStream very easily.

I hope you've got a using statement in your real code, by the way - you don't want to leave your file handle open if something goes wrong while you're writing to it.
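Putting WriteTo and a using statement together might look like this sketch. The wrapper class MemoryStreamSave and its Save method are hypothetical names; MemoryStream.WriteTo and FileStream are the real framework members.

```csharp
using System.IO;

static class MemoryStreamSave
{
    // Dumps the buffered bytes to disk unchanged. The using block closes
    // the file handle even if WriteTo throws.
    public static void Save(MemoryStream stream, string path)
    {
        using (FileStream file = new FileStream(path, FileMode.Create))
        {
            stream.WriteTo(file);
        }
    }
}
```

WriteTo copies the whole contents of the MemoryStream regardless of its current Position, so there is no need to reset it first. The XmlTextWriter should still be flushed before the call.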

Jon

Jon Skeet answered Nov 18 '22 06:11



You need to set the encoding every time you write a string as bytes or read binary data back as a string.

    // Use the encoding the XmlTextWriter used for both reading and writing.
    Encoding encoding = Encoding.GetEncoding("ISO-8859-1");

    stream.Position = 0;
    StreamReader sr = new StreamReader(stream, encoding);
    string content = sr.ReadToEnd();

    // The using blocks flush and close the file even if Write throws.
    using (FileStream myFile = new FileStream(myPath, FileMode.Create))
    using (StreamWriter sw = new StreamWriter(myFile, encoding))
    {
        sw.Write(content);
    }

Thomas Danecker answered Nov 18 '22 07:11
