How can I force XDocument to output "UTF-8" in the declaration line?

The following code produces this output:

<?xml version="1.0" encoding="utf-16" standalone="yes"?>
<customers>
  <customer>
    <firstName>Jim</firstName>
    <lastName>Smith</lastName>
  </customer>
</customers>

How can I get it to produce encoding="utf-8" instead of encoding="utf-16"?

using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

namespace test_xml2
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Customer> customers = new List<Customer> {
                new Customer {FirstName="Jim", LastName="Smith", Age=27},
                new Customer {FirstName="Hank", LastName="Moore", Age=28},
                new Customer {FirstName="Jay", LastName="Smythe", Age=44},
                new Customer {FirstName="Angie", LastName="Thompson", Age=25},
                new Customer {FirstName="Sarah", LastName="Conners", Age=66}
            };

            Console.WriteLine(BuildXmlWithLINQ(customers));

            Console.ReadLine();

        }
        private static string BuildXmlWithLINQ(List<Customer> customers)
        {
            XDocument xdoc =
                new XDocument(
                    new XDeclaration("1.0", "utf-8", "yes"),
                    new XElement("customers",
                        new XElement("customer",
                            new XElement("firstName", "Jim"),
                            new XElement("lastName", "Smith")
                        )
                    )
                );

            var wr = new StringWriter();
            xdoc.Save(wr);

            return wr.GetStringBuilder().ToString();
        }
    }

    public class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public int Age { get; set; }

        public string Display()
        {
            return String.Format("{0}, {1} ({2})", LastName, FirstName, Age);
        }
    }
}
Edward Tanguay asked Jul 20 '10 08:07


2 Answers

Allow me to answer my own question; this seems to work. XDocument.ToString() omits the XML declaration, so the declaration is prepended explicitly:

private static string BuildXmlWithLINQ()
{
    XDocument xdoc = new XDocument
    (
        new XDeclaration("1.0", "utf-8", "yes"),
        new XElement("customers",
            new XElement("customer",
                new XElement("firstName", "Jim"),
                new XElement("lastName", "Smith")
            )
        )
    );
    return xdoc.Declaration.ToString() + Environment.NewLine + xdoc.ToString();
}
Edward Tanguay answered Oct 05 '22 09:10


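This is not a bug in .NET. It happens because you are using a StringWriter as the target for your XDocument. A StringWriter writes into an in-memory .NET string, which is UTF-16, and it reports that through its Encoding property, so the declaration is written with encoding="utf-16" regardless of what the XDeclaration says. If you save the XDocument to a stream or a file, it will use UTF-8 as instructed.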
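As a rough illustration of the stream route, the sketch below saves the same document to a MemoryStream and decodes the bytes back into a string. The class and method names (StreamSaveSketch, BuildXmlViaStream) are made up for this example and are not part of the question's code.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper class, named only for this sketch.
static class StreamSaveSketch
{
    public static string BuildXmlViaStream()
    {
        var xdoc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("customers",
                new XElement("customer",
                    new XElement("firstName", "Jim"),
                    new XElement("lastName", "Smith"))));

        using (var ms = new MemoryStream())
        {
            // Saving to a stream lets the writer honour the encoding
            // named in the XDeclaration instead of the target's encoding.
            xdoc.Save(ms);
            ms.Position = 0;

            // Decode the UTF-8 bytes; StreamReader consumes a leading
            // byte order mark if one was written.
            using (var reader = new StreamReader(ms, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

Calling Console.WriteLine(StreamSaveSketch.BuildXmlViaStream()) should then print a declaration that names utf-8. If the XML is going to a file or over the network anyway, it is usually simpler to keep the bytes and skip the round trip back to a string.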

For more information, see the MSDN documentation for StringWriter.Encoding:

This property is necessary for some XML scenarios where a header must be written containing the encoding used by the StringWriter. This allows the XML code to consume an arbitrary StringWriter and generate the correct XML header.
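Since the header is taken from the writer's Encoding property, one possible workaround, if the result really has to be a string, is a StringWriter subclass that reports UTF-8. This is a minimal sketch; Utf8StringWriter and Utf8WriterSketch are hypothetical names introduced here, not framework types.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that claims UTF-8 as its encoding.
sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.UTF8; }
    }
}

// Hypothetical wrapper class, named only for this sketch.
static class Utf8WriterSketch
{
    public static string BuildXml()
    {
        var xdoc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("customers",
                new XElement("customer",
                    new XElement("firstName", "Jim"),
                    new XElement("lastName", "Smith"))));

        using (var writer = new Utf8StringWriter())
        {
            // The XML writer reads writer.Encoding when it emits the header.
            xdoc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Note that the returned string is still UTF-16 in memory. The declaration is only accurate if the text is later written out as UTF-8 bytes.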

Sami Kuhmonen answered Oct 05 '22 09:10