
How to GetBytes() in C# with UTF8 encoding with BOM?

I'm having a problem with UTF-8 encoding in my ASP.NET MVC 2 application in C#. I'm trying to let the user download a simple text file generated from a string. I get the byte array with the following line:

var x = Encoding.UTF8.GetBytes(csvString);

but when I return it for download using:

return File(x, ..., ...);

I get a file without a BOM, so Croatian characters are not displayed correctly. This is because my byte array does not include the BOM after encoding. I tried inserting those bytes manually, and then the characters show up correctly, but that's not the best way to do it.

I also tried creating a UTF8Encoding instance and passing true to its constructor to include the BOM, but that doesn't work either.

Does anyone have a solution? Thanks!

Nebojsa Veron asked Dec 10 '10 23:12


2 Answers

Try it like this:

public ActionResult Download()
{
    var data = Encoding.UTF8.GetBytes("some data");
    var result = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
    return File(result, "application/csv", "foo.csv");
}

The reason is that the UTF8Encoding constructor that takes a boolean parameter doesn't do what you would expect:

byte[] bytes = new UTF8Encoding(true).GetBytes("a"); 

The resulting array contains a single byte with the value 97. There is no BOM because GetBytes never emits one: UTF-8 doesn't require a BOM, and the constructor flag only controls what GetPreamble() returns.
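To make the difference concrete, here is a minimal sketch of the same idea as a reusable helper. The names `BomDemo` and `WithBom` are made up for illustration and are not part of the framework; only `GetPreamble()`, `GetBytes()` and the LINQ calls are real APIs.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: prepends the encoding's preamble (if it has one)
    // to the encoded text.
    public static byte[] WithBom(string text, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble(); // empty array if the encoding emits no BOM
        byte[] body = encoding.GetBytes(text);    // encoded characters only, never a BOM
        return preamble.Concat(body).ToArray();
    }
}
```

With this helper, `BomDemo.WithBom("a", new UTF8Encoding(true))` yields the four bytes EF BB BF 61, while passing `new UTF8Encoding(false)` yields just 61, because that instance returns an empty preamble.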

Darin Dimitrov answered Sep 29 '22 00:09


I created a simple extension method that converts a string, in any encoding, to the byte array it would produce when written to a file or stream:

public static class StreamExtensions
{
    public static byte[] ToBytes(this string value, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        using (var sw = new StreamWriter(stream, encoding))
        {
            sw.Write(value);
            sw.Flush();
            return stream.ToArray();
        }
    }
}

Usage:

stringValue.ToBytes(Encoding.UTF8) 

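This also works for other encodings such as UTF-16, where the BOM is needed to indicate the byte order. The BOM is included because StreamWriter writes the encoding's preamble at the start of the stream.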
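One way to check what the StreamWriter approach actually produces is to compare the start of the output with the encoding's preamble. The following is only a sketch: `PreambleCheck`, `WriteToBytes` and `StartsWithPreamble` are hypothetical names, and `WriteToBytes` simply repeats the technique from the extension above so the example compiles on its own.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class PreambleCheck
{
    // Same StreamWriter technique as the extension method above.
    public static byte[] WriteToBytes(string value, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        using (var writer = new StreamWriter(stream, encoding))
        {
            writer.Write(value);
            writer.Flush(); // push buffered characters (and the preamble) into the stream
            return stream.ToArray();
        }
    }

    // True when the encoding has a preamble and the bytes begin with it.
    public static bool StartsWithPreamble(byte[] bytes, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        return preamble.Length > 0
            && bytes.Length >= preamble.Length
            && bytes.Take(preamble.Length).SequenceEqual(preamble);
    }
}
```

For the string "a", the check should succeed for `Encoding.UTF8` and `Encoding.Unicode`, and fail for `new UTF8Encoding(false)`, since that instance has an empty preamble and the writer therefore has nothing to prepend.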

Hovhannes Hakobyan answered Sep 28 '22 23:09