I have a program that generates XML files from data in a database. In short, the code does the following:
string dsn = "a db connection string";
XmlDocument d = new XmlDocument();
using (SqlConnection con = new SqlConnection(dsn)) {
    con.Open();
    string sql = "select id as Id, comment as Comment from Test where ... ";
    using (SqlCommand cmd = new SqlCommand(sql, con)) {
        DataSet ds = new DataSet("EXPORT");
        SqlDataAdapter da = new SqlDataAdapter(cmd);
        da.Fill(ds, "Test");
        d.LoadXml(ds.GetXml());
    }
}
d.Save(@"c:\test.xml");
When I look at the XML file, it contains the invalid character &#x1A;
<EXPORT>
  <Test>
    <Id>2</Id>
    <Comment>&#x1A; Keyboard NB5 linked</Comment>
  </Test>
</EXPORT>
Firefox cannot open this XML file and reports an invalid character.
That entity is reserved in ISO 8859-1 and CP1252 and should not be rendered by browsers. But why does XmlDocument output XML that cannot be parsed as valid? Or is it a valid XML document that just cannot be parsed by browsers or imported by Excel and so on? Is there an easy way of getting rid of these reserved 'invalid characters', or of encoding them in a way that browsers do not have a problem with?
Many thanks for your opinions and tips.
Not all characters are representable in XML.
In XML 1.0, none of the characters with values less than 0x20 can be used, except for TAB (0x09), LF (0x0A) and CR (0x0D).
In XML 1.1, just about anything except NUL (0x00) can be used.
If you have the option to use XML 1.1, and the receiving program supports XML 1.1 (not many do), then you can escape the 0x1A as &#x1A; or &#26;.
Wrapping it in CDATA is not a solution either; CDATA is just a convenience for escaping groups of characters differently than the standard &-mechanism.
Otherwise, you will need to remove it prior to serializing.
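As a rough illustration of removing such characters before serializing, here is a minimal sketch that keeps only characters allowed by XML 1.0. The class and method names (XmlSanitizer, StripInvalidXmlChars) are made up for this example and are not part of the .NET framework; silently dropping characters is also an assumption, and you may prefer to substitute something else.

```csharp
using System;
using System.Text;

public static class XmlSanitizer
{
    // True if a single UTF-16 code unit is, on its own, a legal XML 1.0 character:
    // TAB, LF, CR, U+0020..U+D7FF or U+E000..U+FFFD.
    private static bool IsLegalXml10Char(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Hypothetical helper: returns a copy of 'text' without characters
    // that XML 1.0 cannot represent (for example 0x1A).
    public static string StripInvalidXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (IsLegalXml10Char(c))
            {
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < text.Length
                     && char.IsLowSurrogate(text[i + 1]))
            {
                // A well-formed surrogate pair encodes U+10000..U+10FFFF, which is legal.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Everything else (other control characters, lone surrogates,
            // U+FFFE, U+FFFF) is dropped.
        }
        return sb.ToString();
    }
}
```

For the comment from the question, XmlSanitizer.StripInvalidXmlChars("\u001A Keyboard NB5 linked") would return " Keyboard NB5 linked". The cleaning has to happen on the raw string values, before they are turned into XML text.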
I've run into this a few times when creating/manipulating XML from SQL data.
But why does XmlDocument output XML that cannot be parsed as valid? Or is it a valid XML document that just cannot be parsed by browsers or imported by Excel and so on?
The XmlDocument doesn't perform any validation on the data that you send it; it leaves that to you (the developer). This XML document should be invalid in almost everything that uses XML (but I could be wrong about that ... you could always test it :P)
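One way to test it yourself is to run the generated text through a strict reader. The sketch below assumes an XmlReader created via XmlReader.Create with CheckCharacters enabled (which is the default for XmlReaderSettings); the class and method names (XmlCheck, ParsesAsXml10) are invented for this example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlCheck
{
    // Hypothetical helper: returns true if 'xml' can be read from start to end
    // by a conforming XmlReader, false if the reader throws an XmlException.
    public static bool ParsesAsXml10(string xml)
    {
        var settings = new XmlReaderSettings { CheckCharacters = true };
        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml), settings))
            {
                // Reading every node forces the character checks to run.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Calling XmlCheck.ParsesAsXml10 on the output of ds.GetXml() before saving would flag a document like the one in the question, because the character reference to 0x1A is rejected.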
Almost every time I've hit this problem, I ended up either replacing the offending data with the proper character (if it has one) or just getting rid of it.
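Applied to the code from the question, that could look roughly like the following sketch, which cleans the string columns of the DataSet after Fill and before GetXml. DataSetCleaner and ReplaceControlChars are hypothetical names, and the sketch only targets control characters below 0x20 (other than TAB, LF and CR); pass an empty string as the replacement to simply remove them.

```csharp
using System.Data;
using System.Text;

public static class DataSetCleaner
{
    // Hypothetical helper: in every writable string column of the DataSet,
    // replaces control characters below 0x20 (except TAB, LF, CR)
    // with 'replacement'.
    public static void ReplaceControlChars(DataSet ds, string replacement)
    {
        foreach (DataTable table in ds.Tables)
        {
            foreach (DataColumn col in table.Columns)
            {
                if (col.DataType != typeof(string) || col.ReadOnly) continue;

                foreach (DataRow row in table.Rows)
                {
                    if (row.IsNull(col)) continue;

                    string value = (string)row[col];
                    string cleaned = Clean(value, replacement);
                    if (cleaned != value) row[col] = cleaned;
                }
            }
        }
    }

    private static string Clean(string value, string replacement)
    {
        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            bool illegalControl = c < ' ' && c != '\t' && c != '\n' && c != '\r';
            if (illegalControl) sb.Append(replacement);
            else sb.Append(c);
        }
        return sb.ToString();
    }
}
```

In the question's code this would be called as DataSetCleaner.ReplaceControlChars(ds, "") between da.Fill(ds, "Test") and d.LoadXml(ds.GetXml()).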
You could also try putting the text inside a CDATA block, although a raw 0x1A character is still not allowed there in XML 1.0, and it will bloat the file a tiny bit (not sure how big your file will be overall).
Take a look at this question: xml parse error on illegal character
Conclusion (as I understood it): With XML 1.0 it is impossible to store this value.