
Calculate the size of a Base64-decoded message

Tags: c++, c, base64

I have a Base64-encoded string:

static const unsigned char base64_test_enc[] =
    "VGVzdCBzdHJpbmcgZm9yIGEgc3RhY2tvdmVyZmxvdy5jb20gcXVlc3Rpb24=";

It does not contain a CRLF line break every 72 characters.

How can I calculate the length of the decoded message?

Mindaugas Jaraminas asked Dec 31 '15 12:12


2 Answers

Well, base64 represents 3 bytes in 4 characters, so to start with you just need to divide the number of characters by 4 and multiply by 3.

You then need to account for padding:

  • If the text ends with "==" you need to subtract 2 bytes (as the last group of 4 characters only represents 1 byte)
  • If the text ends with just "=" you need to subtract 1 byte (as the last group of 4 characters represents 2 bytes)
  • If the text doesn't end with padding at all, you don't need to subtract anything (as the last group of 4 characters represents 3 bytes as normal)
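
The three cases above can be sketched in C roughly as follows. This is only a minimal sketch: the function name `base64_decoded_len` is made up for illustration, and it assumes the input is valid padded base64 (length a multiple of 4, with `=` only in the last one or two positions). It does no validation.

```c
#include <stddef.h>

/* Decoded length of a padded Base64 string of n characters.
   Assumes valid input: n is a multiple of 4 and '=' appears only
   as the last one or two characters. Nothing is validated here. */
size_t base64_decoded_len(const unsigned char *s, size_t n)
{
    size_t bytes;

    if (n < 4) return 0;            /* empty (or too short to be valid) */

    bytes = (n / 4) * 3;            /* every 4 characters carry 3 bytes */
    if (s[n - 1] == '=') bytes--;   /* "="  : final group holds 2 bytes */
    if (s[n - 2] == '=') bytes--;   /* "==" : final group holds 1 byte  */
    return bytes;
}
```

For the string in the question, which has 60 characters and ends with a single "=", calling `base64_decoded_len(base64_test_enc, sizeof base64_test_enc - 1)` should give 60 / 4 * 3 - 1 = 44.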
Jon Skeet answered Oct 15 '22 07:10


Base 64 uses 4 characters per 3 bytes. If it uses padding it always has a multiple of 4 characters.

Furthermore, there are three padding possibilities:

  • 2 characters and two padding characters (==) for one encoded byte
  • 3 characters and one padding character (=) for two encoded bytes
  • and of course 4 characters and no padding characters, making 3 bytes.

So you can simply divide the number of characters by 4, then multiply by 3 and finally subtract the number of padding characters.


Possible C code could be (I'm extremely rusty in C, so please adjust):

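#include <stddef.h>
#include <string.h>

/* Number of bytes encoded by a padded base 64 string. */
size_t encoded_base64_bytes(const char *input)
{
    size_t len, padlen;
    const char *last, *first_pad;

    len = strlen(input);

    if (len == 0) return 0;

    /* Padding can only occur in the final group of 4 characters. */
    last = input + len - 4;
    first_pad = strchr(last, '=');
    /* Count the '=' characters from the first one to the end of the string. */
    padlen = first_pad == NULL ? 0 : (size_t)(input + len - first_pad);
    return (len / 4) * 3 - padlen;
}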

Note that this code assumes that the input is valid, padded base 64, so its length is a multiple of 4.


A good observer will notice that, if padding is used, the last character before the padding contains spare bits, which are usually set to 0.
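
To make the spare bits concrete: with "==" the two data characters hold 12 bits for a single byte, leaving 4 spare bits, and with "=" the three data characters hold 18 bits for two bytes, leaving 2 spare bits. The sketch below checks whether those bits are zero. Both function names are hypothetical, and it assumes the standard alphabet (`A-Z`, `a-z`, `0-9`, `+`, `/`) and valid padded input.

```c
#include <stddef.h>
#include <string.h>

/* 6-bit value of a standard-alphabet Base64 character, or -1 if not found. */
static int b64_value(unsigned char c)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const char *p = (c != '\0') ? strchr(alphabet, c) : NULL;
    return p ? (int)(p - alphabet) : -1;
}

/* Returns 1 if the spare bits of the last data character are all zero
   (or if there is no padding and hence no spare bits), 0 otherwise.
   Assumes the input is otherwise valid padded Base64. */
int base64_spare_bits_zero(const char *s, size_t n)
{
    int v;

    if (n < 4 || s[n - 1] != '=') return 1;   /* no padding, no spare bits */

    if (s[n - 2] == '=') {
        /* "xx==": the low 4 bits of the second character are spare */
        v = b64_value((unsigned char)s[n - 3]);
        return v >= 0 && (v & 0x0F) == 0;
    }

    /* "xxx=": the low 2 bits of the third character are spare */
    v = b64_value((unsigned char)s[n - 2]);
    return v >= 0 && (v & 0x03) == 0;
}
```

In the question's string the final group is "b24=", and '4' has the value 56 (binary 111000), so its two spare bits are zero.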

Maarten Bodewes answered Oct 15 '22 07:10