
Validate string is base64 format using RegEx?

I have been looking for a way to validate a Base64 string and came across this:

 ^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$

I need a little help to make it allow "==" as well as "=" for the padding.
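For reference, here is a minimal sketch of how the pattern above behaves with .NET's Regex class. The class name Base64Regex and the sample inputs mentioned below are illustrative assumptions, not part of the original post. As written, the pattern already appears to accept both one and two trailing pad characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Base64Regex
{
    // The pattern quoted in the question, unchanged.
    private static readonly Regex Pattern = new Regex(
        @"^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$");

    // Returns true when the whole string consists of 4-character Base64
    // groups, optionally ending in a group padded with "=" or "==".
    public static bool IsMatch(string value)
    {
        return value != null && Pattern.IsMatch(value);
    }
}
```

With this sketch, "SGVsbG8=" (one pad character) and "SA==" (two pad characters) both match, while "SGVsbG8" (length not a multiple of 4) does not. Two things to keep in mind: the pattern also matches the empty string, and in .NET the `$` anchor also matches just before a trailing newline, so `\z` may be preferable when that matters.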

Thanks

arbme asked Jul 28 '10 17:07




2 Answers

This should perform well, since the character check rejects most invalid input before the decode attempt can throw.

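// Requires: using System; using System.Collections.Generic; using System.Linq;
// These members are meant to live inside a class of your own.
private static readonly HashSet<char> _base64Characters = new HashSet<char>() { 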
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 
    'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 
    'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 
    'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/', 
    '='
};

public static bool IsBase64String(string value)
{
    if (string.IsNullOrEmpty(value))
    {
        return false;
    }
    else if (value.Any(c => !_base64Characters.Contains(c)))
    {
        return false;
    }

    try
    {
        Convert.FromBase64String(value);
        return true;
    }
    catch (FormatException)
    {
        return false;
    }
}
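As a variation on the same idea, newer runtimes offer a decode call that reports failure without throwing. The following is only a sketch under that assumption: it requires Convert.TryFromBase64String, which exists in .NET Core 2.1 and later but not in .NET Framework, and the class name Base64Check is hypothetical.

```csharp
using System;

public static class Base64Check
{
    // Assumes a runtime that provides Convert.TryFromBase64String
    // (.NET Core 2.1 or later); it is not available on .NET Framework.
    public static bool IsBase64String(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return false;
        }

        // A buffer as long as the input is always large enough, because
        // every 4 Base64 characters decode to at most 3 bytes.
        byte[] buffer = new byte[value.Length];
        int bytesWritten;
        return Convert.TryFromBase64String(value, buffer, out bytesWritten);
    }
}
```

Unlike the version above, this sketch has no character pre-check, so it tolerates embedded whitespace the same way Convert.FromBase64String does. Keep the HashSet check if whitespace should be rejected.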
ChaosPandion answered Sep 22 '22 03:09


I've updated the above code a bit to meet a few more requirements:

  • check that the string length is correct (it should be a multiple of 4)
  • check the pad character count (at most 2 characters, and only at the end of the string)
  • make it work in .NET 2.0 (HashSet<T> is not available there, so it has to be implemented separately or replaced with a Dictionary<TKey, TValue>)
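For the .NET 2.0 point, one possible shape of the Dictionary-based replacement is sketched below. The names Base64Alphabet, Lookup and Contains are hypothetical and not part of the original code; the sketch deliberately avoids LINQ, var and collection initializers so that it stays within C# 2.0.

```csharp
using System;
using System.Collections.Generic;

public static class Base64Alphabet
{
    // The 64 Base64 characters; the pad character '=' is deliberately excluded.
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Dictionary used as a set: only the keys matter, the bool value is ignored.
    private static readonly Dictionary<char, bool> Lookup = BuildLookup();

    private static Dictionary<char, bool> BuildLookup()
    {
        Dictionary<char, bool> lookup = new Dictionary<char, bool>(Alphabet.Length);
        foreach (char c in Alphabet)
        {
            lookup[c] = true;
        }
        return lookup;
    }

    // Plays the role of HashSet<char>.Contains in the code that follows.
    public static bool Contains(char c)
    {
        return Lookup.ContainsKey(c);
    }
}
```

A call such as Base64Alphabet.Contains(c) could then stand in for Base64Characters.Contains(c) wherever HashSet<T> is unavailable.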

The code is part of my assertion library, which is why there are two check methods and the param parameter.

    private const char Base64Padding = '=';

    private static readonly HashSet<char> Base64Characters = new HashSet<char>()
    { 
        'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 
        'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 
        'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 
        'w', 'x', 'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/'
    };

    public static void CheckBase64String(string param, string paramName)
    {
        if (CheckBase64StringSafe(param) == false)
        {
            throw (new ArgumentException(String.Format("Parameter '{0}' is not a valid Base64 string.", paramName)));
        }
    }

    public static bool CheckBase64StringSafe(string param)
    {
        if (param == null)
        {
            // null string is not Base64 something
            return false;
        }

        // remove optional CR and LF characters
        param = param.Replace("\r", String.Empty).Replace("\n", String.Empty);

        if (param.Length == 0 ||
            (param.Length % 4) != 0)
        {
            // Base64 string should not be empty
            // Base64 string length should be multiple of 4
            return false;
        }

        // strip pad characters and count how many were removed
        int lengthWithPadding = param.Length;
        int lengthWithoutPadding;

        param = param.TrimEnd(Base64Padding);
        lengthWithoutPadding = param.Length;

        if ((lengthWithPadding - lengthWithoutPadding) > 2)
        {
            // there should be no more than 2 pad characters
            return false;
        }

        foreach (char c in param)
        {
            if (Base64Characters.Contains(c) == false)
            {
                // string contains non-Base64 character
                return false;
            }
        }

        // nothing invalid found
        return true;
    }
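The length and padding checks above follow from how the encoder works: every 3 input bytes become 4 output characters, and a final group of 1 or 2 bytes is padded with "==" or "=" respectively. A small sketch to observe this with Convert.ToBase64String (PaddingDemo and PadCount are hypothetical names used only for illustration):

```csharp
using System;

public static class PaddingDemo
{
    // Returns the number of trailing '=' characters that
    // Convert.ToBase64String emits for an input of the given byte count.
    public static int PadCount(int byteCount)
    {
        string encoded = Convert.ToBase64String(new byte[byteCount]);
        return encoded.Length - encoded.TrimEnd('=').Length;
    }
}
```

For example, PadCount(1) is 2, PadCount(2) is 1 and PadCount(3) is 0, and the pattern repeats from there, so a well-formed string never ends in more than two pad characters.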

I've not tested the code extensively, so there are no guarantees that it works in every case!

Libor answered Sep 22 '22 03:09