I need to detect strings with the form @base64 (e.g. @VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw==
) in my application.
The @ has to be at the beginning and the charset for base64 encoded strings is a-z
, A-Z
, 0-9
, +
, /
and =
. Would be the appropiate regular expresion to detect them?
Thanks
A regular expression that validates base64 encoded data needs to check for the characters A to Z, a to z, 0 to 9, plus (+), and forward-slash (/) combined in a multiple of 4. If the number of characters is not an exact multiple of 4, the expression must search for the equal sign (=) as padding at the end.
In base64 encoding, the character set is [A-Z, a-z, 0-9, and + /] . If the rest length is less than 4, the string is padded with '=' characters. ^([A-Za-z0-9+/]{4})* means the string starts with 0 or more base64 groups.
Cg== is the base64 encode of the new line character in the latest position. So if you want to encode ABC you will get QUJD , however if you include a "return character" after ABC you will get QUJDCg== .
Something like this should do (does not check for proper length!):
^@[a-zA-Z0-9+/]+={,2}$
The length of any base64 encoded string must be a multiple of 4, hence the additional.
See here for a solution that checks against proper length: RegEx to parse or validate Base64 data
A quick explanation of the regex from the linked answer:
^@ #match "@" at beginning of string
(?:[A-Za-z0-9+/]{4})* #match any number of 4-letter blocks of the base64 char set
(?:
[A-Za-z0-9+/]{2}== #match 2-letter block of the base64 char set followed by "==", together forming a 4-letter block
| # or
[A-Za-z0-9+/]{3}= #match 3-letter block of the base64 char set followed by "=", together forming a 4-letter block
)?
$ #match end of string
try with:
^@(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$
=> RegEx to parse or validate Base64 data
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With