
How can I detect if a .NET StreamReader found a UTF8 BOM on the underlying stream?

I open a FileStream(filename,FileMode.Open,FileAccess.Read,FileShare.ReadWrite) and then wrap it in a StreamReader(stream,true).

Is there a way I can check if the stream started with a UTF8 BOM? I am noticing that files without the BOM are read as UTF8 by the StreamReader.

How can I tell them apart?

bookclub asked Feb 16 '11 03:02


1 Answer

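Rather than hardcoding the BOM bytes, it is cleaner to get them from the API with UTF8Encoding.GetPreamble(). Note that this checks a byte array rather than the StreamReader itself: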

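using System;
using System.Linq;
using System.Text;

public string ConvertFromUtf8(byte[] bytes)
{
  var enc = new UTF8Encoding(true);
  var preamble = enc.GetPreamble(); // the UTF-8 BOM: EF BB BF
  // Take() never reads past the end, so input shorter than the BOM is rejected instead of throwing IndexOutOfRangeException.
  if (!bytes.Take(preamble.Length).SequenceEqual(preamble))
    throw new ArgumentException("Not utf8-BOM");
  // Decode everything after the BOM without copying the array.
  return enc.GetString(bytes, preamble.Length, bytes.Length - preamble.Length);
}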
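
The same preamble comparison can be applied to the FileStream from the question before it is handed to the StreamReader. The following is only a minimal sketch: the class name BomCheck and the method StartsWithUtf8Bom are hypothetical names made up for illustration, and the code assumes the stream is seekable (true for a FileStream opened on a regular file).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomCheck
{
    // Hypothetical helper, not part of .NET: returns true if the stream
    // starts with the UTF-8 BOM. Assumes the stream supports seeking and
    // restores the original position before returning.
    public static bool StartsWithUtf8Bom(Stream stream)
    {
        byte[] preamble = new UTF8Encoding(true).GetPreamble();
        byte[] head = new byte[preamble.Length];
        long start = stream.Position;

        // Stream.Read may return fewer bytes than requested,
        // so keep reading until the buffer is full or the stream ends.
        int read = 0;
        while (read < head.Length)
        {
            int n = stream.Read(head, read, head.Length - read);
            if (n == 0) break;
            read += n;
        }

        // Rewind so a reader created afterwards sees the stream from the same point.
        stream.Position = start;
        return read == preamble.Length && head.SequenceEqual(preamble);
    }
}
```

Under these assumptions, you would call BomCheck.StartsWithUtf8Bom(stream) on the FileStream first and then construct the StreamReader(stream,true) as before. Because the position is restored, the reader still sees the stream from its start.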
Carlo V. Dango answered Nov 15 '22 22:11