 

Convert .net String object into base64 encoded string

I have a question: which Unicode encoding should I use when encoding a .NET string into Base64? I know strings are UTF-16 encoded on Windows, so is my way of encoding the right one?

public static String ToBase64String(this String source)
{
    return Convert.ToBase64String(Encoding.Unicode.GetBytes(source));
}
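For reference, Encoding.Unicode is .NET's little-endian UTF-16 encoding. A rough equivalent of the same round trip, sketched in Python (the function name here is illustrative):

```python
import base64

def to_base64_string(source: str) -> str:
    # .NET's Encoding.Unicode is little-endian UTF-16 without a BOM,
    # which corresponds to Python's "utf-16-le" codec.
    return base64.b64encode(source.encode("utf-16-le")).decode("ascii")

print(to_base64_string("abc"))  # → 'YQBiAGMA'
```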
chester89, asked Apr 13 '10

People also ask

How do you convert a string to Base64?

To convert a string into a Base64 string, the following steps should be followed: get the ASCII value of each character in the string; compute the 8-bit binary equivalent of those values; regroup the resulting bit string into chunks of 6 bits; then map each 6-bit value to a Base64 digit.
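Those steps can be sketched directly (Python used for illustration; in practice the standard base64 module does all of this for you):

```python
import base64
import string

BASE64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def manual_base64(text: str) -> str:
    # 1. Get the ASCII value of each character.
    data = text.encode("ascii")
    # 2. Compute the 8-bit binary equivalent of each value.
    bits = "".join(f"{b:08b}" for b in data)
    # 3. Regroup the bit string into 6-bit chunks, zero-padding the last one.
    bits += "0" * (-len(bits) % 6)
    chunks = [bits[i:i + 6] for i in range(0, len(bits), 6)]
    # 4. Map each 6-bit value to a Base64 digit and pad with "=" to a multiple of 4.
    out = "".join(BASE64_ALPHABET[int(c, 2)] for c in chunks)
    return out + "=" * (-len(out) % 4)

assert manual_base64("Man") == base64.b64encode(b"Man").decode("ascii")  # "TWFu"
```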

What is Base64 string in C#?

Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. Each Base64 digit represents exactly 6 bits of data, which means four 6-bit Base64 digits can represent 3 bytes.
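The 3-byte-to-4-digit relationship is easy to check (sketched here in Python):

```python
import base64

# 3 bytes (24 bits) map to exactly 4 Base64 digits (4 x 6 bits),
# so inputs whose length is a multiple of 3 need no padding.
for raw in (b"abc", b"abcdef", b"abcdefghi"):
    encoded = base64.b64encode(raw)
    print(len(raw), "bytes ->", len(encoded), "digits")
    assert len(encoded) == 4 * (len(raw) // 3)
```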

What is ToBase64String?

ToBase64String(Byte[]) Converts an array of 8-bit unsigned integers to its equivalent string representation that is encoded with base-64 digits. ToBase64String(Byte[], Base64FormattingOptions) Converts an array of 8-bit unsigned integers to its equivalent string representation that is encoded with base-64 digits.

Why do Base64 strings end with ==?

Q: Why does an = get appended at the end? A: The trailing = sign is padding: it is added in the final step of encoding when the number of input bytes is not a multiple of three, so that the encoded output always comes out as a multiple of four characters.
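The amount of padding follows directly from the input length modulo 3 (illustrated in Python):

```python
import base64

# 0 leftover bytes -> no "=", 2 leftover -> one "=", 1 leftover -> two "=".
print(base64.b64encode(b"abc"))  # b'YWJj'  no padding
print(base64.b64encode(b"ab"))   # b'YWI='  one "="
print(base64.b64encode(b"a"))    # b'YQ=='  two "="
```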


2 Answers

What you've provided is perfectly functional. It will produce a base64-encoded string of the bytes of your source string encoded in UTF-16.

If you're asking whether UTF-16 can represent any character in your string, then yes. Unlike UTF-32, UTF-16 is a variable-length encoding: it uses two bytes for characters in the Basic Multilingual Plane and four bytes (a surrogate pair) for all other characters.

There are no Unicode characters that cannot be represented by UTF-16.
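The two-versus-four-byte behaviour is easy to observe (Python shown for illustration; "utf-16-le" matches .NET's Encoding.Unicode):

```python
# BMP characters take 2 bytes in UTF-16; characters outside the BMP,
# such as emoji, are encoded as a 4-byte surrogate pair.
print(len("é".encode("utf-16-le")))   # 2
print(len("你".encode("utf-16-le")))  # 2
print(len("😀".encode("utf-16-le")))  # 4
```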

Adam Robinson, answered Sep 30 '22


Be aware that you don't have to use UTF-16 just because that's what .NET strings use. When you create that byte array, you're free to choose any encoding that will handle all the characters in your string. For example, UTF-8 would be more efficient if the text is in a Latin-based language, but it can still handle every known character.
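That size difference is easy to measure (a quick Python check):

```python
text = "Hello, world"

# For ASCII/Latin text, UTF-8 uses one byte per character,
# while UTF-16 always uses at least two.
print(len(text.encode("utf-8")))      # 12 bytes
print(len(text.encode("utf-16-le")))  # 24 bytes
```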

The most important concern is that whatever software decodes the Base64 string needs to know which encoding to apply to the resulting byte array to re-create the original string.
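A quick round trip demonstrates why both sides must agree on the encoding (sketched in Python):

```python
import base64

# Encode with UTF-16 LE (what .NET's Encoding.Unicode produces).
encoded = base64.b64encode("café".encode("utf-16-le")).decode("ascii")

# Base64 decoding only recovers the bytes; the text encoding
# still has to be known to recover the original string.
raw = base64.b64decode(encoded)
assert raw.decode("utf-16-le") == "café"  # correct encoding round-trips
assert raw.decode("latin-1") != "café"    # wrong encoding mangles the text
```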

Alan Moore, answered Sep 30 '22