I have a SQL file added to my VS.NET 2008 project as an embedded resource. Whenever I use the following code to read the file's content, the string returned always starts with three junk characters and then the text I expect. I assume this has something to do with the Encoding.Default I am using, but that is just a guess. Why does this text keep showing up? Should I just trim off the first three characters or is there a more informed approach?
public string GetUpdateRestoreSchemaScript()
{
    var type = GetType();
    var a = Assembly.GetAssembly(type);
    var script = "UpdateRestoreSchema.sql";
    var resourceName = String.Concat(type.Namespace, ".", script);
    using (Stream stream = a.GetManifestResourceStream(resourceName))
    {
        byte[] buffer = new byte[stream.Length];
        stream.Read(buffer, 0, buffer.Length);
        // UPDATE: Should be Encoding.UTF8
        return Encoding.Default.GetString(buffer);
    }
}
Update: I now know that my code works as expected if I simply change the last line to decode the buffer as UTF-8. That holds for this embedded file, but will it hold for every file? Is there a way to test any buffer to determine its encoding?
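The file is most likely saved as UTF-8 with a byte order mark (BOM). The three junk characters would then be the BOM bytes EF BB BF, which Encoding.Default (the system's ANSI code page on the .NET Framework, not ASCII) decodes as three separate characters. Rather than trimming them off, why don't you use a specific encoding?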
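As a minimal sketch of what "use a specific encoding" could look like, the helper below decodes a stream through a StreamReader instead of a raw byte buffer. The class and method names (ResourceText, ReadAllText) are hypothetical and not part of the original code. It assumes UTF-8 is the right default when no BOM is present.

```csharp
using System;
using System.IO;
using System.Text;

public static class ResourceText
{
    // Hypothetical helper: decodes a stream as text, defaulting to UTF-8.
    // With detectEncodingFromByteOrderMarks set to true, StreamReader switches
    // to the encoding indicated by a leading BOM and skips the BOM itself.
    public static string ReadAllText(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the question's method, the body of the using block could then be reduced to `return ResourceText.ReadAllText(stream);`. One reason to prefer this over Encoding.UTF8.GetString(buffer) is that GetString keeps the BOM in the result as an invisible U+FEFF character at the start of the string, whereas StreamReader drops it.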
Edit to answer a comment:
To guess the file's encoding, you can look for a BOM at the start of the stream. If one is present, it tells you the encoding. If not, you can only guess or ask the user.
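The BOM check described above can be sketched as follows. BomSniffer and DetectFromBom are hypothetical names introduced for illustration. The sketch only recognises the UTF-8, UTF-16 and UTF-32 BOMs and returns null when none is found, which is the "guess or ask" case.

```csharp
using System;
using System.Text;

public static class BomSniffer
{
    // Returns the encoding indicated by a leading BOM, or null if there is none.
    public static Encoding DetectFromBom(byte[] buffer)
    {
        if (buffer == null) return null;

        // UTF-32 LE (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE),
        // so the longer pattern has to be tested first.
        if (StartsWith(buffer, 0xFF, 0xFE, 0x00, 0x00)) return new UTF32Encoding(false, true);
        if (StartsWith(buffer, 0x00, 0x00, 0xFE, 0xFF)) return new UTF32Encoding(true, true);
        if (StartsWith(buffer, 0xEF, 0xBB, 0xBF)) return Encoding.UTF8;
        if (StartsWith(buffer, 0xFF, 0xFE)) return Encoding.Unicode;
        if (StartsWith(buffer, 0xFE, 0xFF)) return Encoding.BigEndianUnicode;

        return null; // no BOM: the encoding cannot be determined this way
    }

    private static bool StartsWith(byte[] buffer, params byte[] prefix)
    {
        if (buffer.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (buffer[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A caller might write `var encoding = BomSniffer.DetectFromBom(buffer) ?? Encoding.UTF8;` to fall back to UTF-8. Even a BOM match is a heuristic: UTF-16 LE text whose first character is U+0000 starts with the same four bytes as the UTF-32 LE BOM.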