If you want to remove the byte order mark from a source file, you need a text editor that lets you choose whether the mark is saved. Open the file with the BOM in the editor, then save it again without the BOM, which converts the encoding. The mark should then no longer appear.
The UTF-8 encoding without a BOM has the property that a document which contains only characters from the US-ASCII range is encoded byte-for-byte the same way as the same document encoded using the US-ASCII encoding. Such a document can be processed and understood when encoded either as UTF-8 or as US-ASCII.
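To see the ASCII compatibility described above, here is a minimal sketch. The class and method names are made up for illustration. Note that Encoding.GetBytes never prepends a preamble, so this compares the raw encodings only:

```csharp
using System;
using System.Linq;
using System.Text;

static class AsciiCompatibility
{
    // Hypothetical helper: returns true when the text encodes to the same
    // bytes in US-ASCII and in UTF-8 (GetBytes itself never writes a BOM).
    public static bool EncodesIdentically(string text)
    {
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        byte[] utf8 = new UTF8Encoding(false).GetBytes(text);
        return ascii.SequenceEqual(utf8);
    }
}
```

For a pure ASCII string such as "hi there" the two byte arrays should match. As soon as the text contains a character outside the ASCII range they differ, because the ASCII encoder substitutes a question mark while UTF-8 emits a multi-byte sequence.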
UTF-8 has the same byte order regardless of platform endianness, so a byte order mark isn't needed. However, it may occur (as the byte sequence EF BB BF) in data that was converted to UTF-8 from UTF-16, or as a "signature" to indicate that the data is UTF-8.
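If you would rather strip the signature programmatically than through an editor, something like the following should work. It is only a sketch: BomHelper and its methods are names I made up, and it only handles the UTF-8 mark (the three bytes EF BB BF, which is also what Encoding.UTF8.GetPreamble() returns):

```csharp
using System;

static class BomHelper
{
    // Hypothetical helper: true when the data starts with the UTF-8 BOM (EF BB BF).
    public static bool HasUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == 0xEF
            && data[1] == 0xBB
            && data[2] == 0xBF;
    }

    // Returns the data without a leading UTF-8 BOM; data without a BOM is returned unchanged.
    public static byte[] StripUtf8Bom(byte[] data)
    {
        if (!HasUtf8Bom(data))
        {
            return data;
        }
        byte[] result = new byte[data.Length - 3];
        Array.Copy(data, 3, result, 0, result.Length);
        return result;
    }
}
```

To rewrite a file in place you could combine it with File.ReadAllBytes and File.WriteAllBytes, for example File.WriteAllBytes(path, BomHelper.StripUtf8Bom(File.ReadAllBytes(path))).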
In order to omit the byte order mark (BOM), your stream must use an instance of UTF8Encoding other than System.Text.Encoding.UTF8 (which is configured to generate a BOM). There are two easy ways to do this:
1. Explicitly specifying a suitable encoding:
Call the UTF8Encoding constructor with False for the encoderShouldEmitUTF8Identifier parameter.
Pass the UTF8Encoding instance to the stream constructor.
' VB.NET:
Dim utf8WithoutBom As New System.Text.UTF8Encoding(False)
Using sink As New StreamWriter("Foobar.txt", False, utf8WithoutBom)
    sink.WriteLine("...")
End Using
// C#:
var utf8WithoutBom = new System.Text.UTF8Encoding(false);
using (var sink = new StreamWriter("Foobar.txt", false, utf8WithoutBom))
{
    sink.WriteLine("...");
}
2. Using the default encoding:
If you do not supply an Encoding to StreamWriter's constructor at all, StreamWriter will by default use a UTF-8 encoding without BOM, so the following should work just as well:
' VB.NET:
Using sink As New StreamWriter("Foobar.txt")
    sink.WriteLine("...")
End Using
// C#:
using (var sink = new StreamWriter("Foobar.txt"))
{
    sink.WriteLine("...");
}
Finally, note that omitting the BOM is only permissible for UTF-8, not for UTF-16.
Try this:
Encoding outputEnc = new UTF8Encoding(false); // create a UTF-8 encoding that emits no BOM
using (TextWriter file = new StreamWriter(filePath, false, outputEnc)) // open the file with that encoding
{
    // write data here
} // leaving the using block flushes and closes the file, even if an exception is thrown
Simply use the WriteAllText method from System.IO.File.
Please check the sample from File.WriteAllText.
This method uses UTF-8 encoding without a Byte-Order Mark (BOM), so using the GetPreamble method will return an empty byte array. If it is necessary to include a UTF-8 identifier, such as a byte order mark, at the beginning of a file, use the WriteAllText(String, String, Encoding) method overload with UTF8 encoding.
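A quick way to check the difference between the two overloads yourself. This is a rough sketch, assuming a writable temp directory; WriteAllTextDemo and BomLength are names I invented for the example:

```csharp
using System;
using System.IO;
using System.Text;

static class WriteAllTextDemo
{
    // Hypothetical helper: writes a short string to a temp file with the chosen
    // overload and returns the length of the UTF-8 BOM found at the start (0 or 3).
    public static int BomLength(bool passEncodingUtf8)
    {
        string path = Path.GetTempFileName();
        try
        {
            if (passEncodingUtf8)
            {
                File.WriteAllText(path, "hi there", Encoding.UTF8);
            }
            else
            {
                File.WriteAllText(path, "hi there");
            }

            byte[] bytes = File.ReadAllBytes(path);
            bool hasBom = bytes.Length >= 3
                && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
            return hasBom ? 3 : 0;
        }
        finally
        {
            File.Delete(path); // clean up the temp file
        }
    }
}
```

The two-argument overload should report 0 and the overload that receives Encoding.UTF8 should report 3.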
Interesting note with respect to this: strangely, the static "CreateText()" method of the System.IO.File class creates UTF-8 files without BOM.
In general this is a source of bugs, but in your case it could have been the simplest workaround :)
If you do not specify an Encoding when creating a new StreamWriter, the default Encoding object used is UTF-8 without BOM, which is created via new UTF8Encoding(false, true).
So to create a text file without the BOM, use one of the constructors that do not require you to provide an encoding:
new StreamWriter(Stream)
new StreamWriter(String)
new StreamWriter(String, Boolean)
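You can verify this without touching the disk by writing into a MemoryStream. The sketch below uses the StreamWriter(Stream) and StreamWriter(Stream, Encoding) constructors; the wrapper class and method are hypothetical names for illustration only:

```csharp
using System;
using System.IO;
using System.Text;

static class StreamWriterDefaultDemo
{
    // Hypothetical helper: writes the text through a StreamWriter and returns the raw bytes.
    // With encoding == null the StreamWriter(Stream) constructor is used, i.e. the default encoding.
    public static byte[] Write(string text, Encoding encoding = null)
    {
        var buffer = new MemoryStream();
        using (var writer = encoding == null
            ? new StreamWriter(buffer)
            : new StreamWriter(buffer, encoding))
        {
            writer.Write(text);
        }
        // ToArray still works after the writer has closed the underlying MemoryStream.
        return buffer.ToArray();
    }
}
```

Calling Write("hi") should give just the two text bytes, while Write("hi", Encoding.UTF8) should give five bytes, with EF BB BF in front.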
I think Roman Nikitin is right. The meaning of the constructor argument is flipped. False means no BOM and true means with BOM.
You get an ANSI encoding because a file without a BOM that contains only ASCII characters is byte-for-byte the same as an ANSI file. Try some special characters in your "hi there" string and you'll see the detected encoding change from ANSI to UTF-8 without BOM.