Base64 only contains A–Z , a–z , 0–9 , + , / and = . So the list of characters not to be used is: all possible characters minus the ones mentioned above. For special purposes .
Q Does a base64 string always end with = ? Q Why does an = get appended at the end? A: As a short answer: The last character ( = sign) is added only as a complement (padding) in the final process of encoding a message with a special number of characters.
Base-64 maps 3 bytes (8 x 3 = 24 bits) in 4 characters that span 6-bits (6 x 4 = 24 bits). The result looks something like "TWFuIGlzIGRpc3Rpb...".
Use Convert.TryFromBase64String from C# 7.2
public static bool IsBase64String(string base64)
{
Span<byte> buffer = new Span<byte>(new byte[base64.Length]);
return Convert.TryFromBase64String(base64, buffer , out int bytesParsed);
}
Update: For newer versions of C#, there's a much better alternative, please refer to the answer by Tomas below.
It's pretty easy to recognize a Base64 string, as it will only be composed of characters 'A'..'Z', 'a'..'z', '0'..'9', '+', '/'
and it is often padded at the end with up to three '=', to make the length a multiple of 4. But instead of comparing these, you'd be better off ignoring the exception, if it occurs.
I know you said you didn't want to catch an exception. But, because catching an exception is more reliable, I will go ahead and post this answer.
public static bool IsBase64(this string base64String) {
// Credit: oybek https://stackoverflow.com/users/794764/oybek
if (string.IsNullOrEmpty(base64String) || base64String.Length % 4 != 0
|| base64String.Contains(" ") || base64String.Contains("\t") || base64String.Contains("\r") || base64String.Contains("\n"))
return false;
try{
Convert.FromBase64String(base64String);
return true;
}
catch(Exception exception){
// Handle the exception
}
return false;
}
Update: I've updated the condition thanks to oybek to further improve reliability.
I believe the regex should be:
Regex.IsMatch(s, @"^[a-zA-Z0-9\+/]*={0,2}$")
Only matching one or two trailing '=' signs, not three.
s
should be the string that will be checked. Regex
is part of the System.Text.RegularExpressions
namespace.
Just for the sake of completeness I want to provide some implementation. Generally speaking Regex is an expensive approach, especially if the string is large (which happens when transferring large files). The following approach tries the fastest ways of detection first.
public static class HelperExtensions {
// Characters that are used in base64 strings.
private static Char[] Base64Chars = new[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/' };
/// <summary>
/// Extension method to test whether the value is a base64 string
/// </summary>
/// <param name="value">Value to test</param>
/// <returns>Boolean value, true if the string is base64, otherwise false</returns>
public static Boolean IsBase64String(this String value) {
// The quickest test. If the value is null or is equal to 0 it is not base64
// Base64 string's length is always divisible by four, i.e. 8, 16, 20 etc.
// If it is not you can return false. Quite effective
// Further, if it meets the above criterias, then test for spaces.
// If it contains spaces, it is not base64
if (value == null || value.Length == 0 || value.Length % 4 != 0
|| value.Contains(' ') || value.Contains('\t') || value.Contains('\r') || value.Contains('\n'))
return false;
// 98% of all non base64 values are invalidated by this time.
var index = value.Length - 1;
// if there is padding step back
if (value[index] == '=')
index--;
// if there are two padding chars step back a second time
if (value[index] == '=')
index--;
// Now traverse over characters
// You should note that I'm not creating any copy of the existing strings,
// assuming that they may be quite large
for (var i = 0; i <= index; i++)
// If any of the character is not from the allowed list
if (!Base64Chars.Contains(value[i]))
// return false
return false;
// If we got here, then the value is a valid base64 string
return true;
}
}
EDIT
As suggested by Sam, you can also change the source code slightly. He provides a better performing approach for the last step of tests. The routine
private static Boolean IsInvalid(char value) {
var intValue = (Int32)value;
// 1 - 9
if (intValue >= 48 && intValue <= 57)
return false;
// A - Z
if (intValue >= 65 && intValue <= 90)
return false;
// a - z
if (intValue >= 97 && intValue <= 122)
return false;
// + or /
return intValue != 43 && intValue != 47;
}
can be used to replace if (!Base64Chars.Contains(value[i]))
line with if (IsInvalid(value[i]))
The complete source code with enhancements from Sam will look like this (removed comments for clarity)
public static class HelperExtensions {
public static Boolean IsBase64String(this String value) {
if (value == null || value.Length == 0 || value.Length % 4 != 0
|| value.Contains(' ') || value.Contains('\t') || value.Contains('\r') || value.Contains('\n'))
return false;
var index = value.Length - 1;
if (value[index] == '=')
index--;
if (value[index] == '=')
index--;
for (var i = 0; i <= index; i++)
if (IsInvalid(value[i]))
return false;
return true;
}
// Make it private as there is the name makes no sense for an outside caller
private static Boolean IsInvalid(char value) {
var intValue = (Int32)value;
if (intValue >= 48 && intValue <= 57)
return false;
if (intValue >= 65 && intValue <= 90)
return false;
if (intValue >= 97 && intValue <= 122)
return false;
return intValue != 43 && intValue != 47;
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With