Is it possible to decode base64 encoded data to binary data at compile-time?
I think of something that looks like this:
constexpr auto decoded = decodeBase64<"SGVsbG8=">();
or
constexpr auto decoded = decodeBase64("SGVsbG8=");
I have no special requirements fo the resulting type of decoded
.
How fast can you decode base64 data? On a recent Intel processor, it takes roughly 2 cycles per byte (from cache) when using a fast decoder like the one from the Chrome browser.
Although Base64 is a relatively efficient way of encoding binary data it will, on average still increase the file size for more than 25%. This not only increases your bandwidth bill, but also increases the download time.
To decode a file with contents that are base64 encoded, you simply provide the path of the file with the --decode flag. As with encoding files, the output will be a very long string of the original file. You may want to output stdout directly to a file.
Base64 is an encoding and decoding technique used to convert binary data to an American Standard for Information Interchange (ASCII) text format, and vice versa.
I found it surprisingly hard to google for a constexpr base64 decoder, so I adapted the one here: https://gist.github.com/tomykaira/f0fd86b6c73063283afe550bc5d77594
Since that's MIT licensed, (sigh), be sure to slap this somewhere in the source file:
/**
* The MIT License (MIT)
* Copyright (c) 2016 tomykaira
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
* LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
* OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
* WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/
To return a string from a constexpr
function, you need to return a char array. Because you can't return an array or std::string
, an std::array
is the best option. But there is a problem - due to a standards oversight, until C++17 the []
operator of std::array
is non-const. You can work around that by inheriting and adding a constructor though:
template <size_t N>
struct fixed_string : std::array<char, N> {
constexpr fixed_string(const char (&input)[N]) : fixed_string(input, std::make_index_sequence<N>{}) {}
template <size_t... Is>
constexpr fixed_string(const char (&input)[N], std::index_sequence<Is...>) : std::array<char, N>{ input[Is]... } {}
};
Change the decoder to use that instead of std::string
, and it seems to work as constexpr. Requires C++14 because C++11 constexpr functions can only have one return statement:
template <size_t N>
constexpr const std::array<char, ((((N-1) >> 2) * 3) + 1)> decode(const char(&input)[N]) {
constexpr unsigned char kDecodingTable[] = {
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 62, 64, 64, 64, 63,
52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 64, 64, 64, 64, 64, 64,
64, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 64, 64, 64, 64, 64,
64, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64
};
static_assert(((N-1) & 3) == 0, "Input data size is not a multiple of 4");
char out[(((N-1) >> 2) * 3) + 1] {0};
size_t out_len = (N-1) / 4 * 3;
if (input[(N-1) - 1] == '=') out_len--;
if (input[(N-1) - 2] == '=') out_len--;
for (size_t i = 0, j = 0; i < N-1;) {
uint32_t a = input[i] == '=' ? 0 & i++ : kDecodingTable[static_cast<int>(input[i++])];
uint32_t b = input[i] == '=' ? 0 & i++ : kDecodingTable[static_cast<int>(input[i++])];
uint32_t c = input[i] == '=' ? 0 & i++ : kDecodingTable[static_cast<int>(input[i++])];
uint32_t d = input[i] == '=' ? 0 & i++ : kDecodingTable[static_cast<int>(input[i++])];
uint32_t triple = (a << 3 * 6) + (b << 2 * 6) + (c << 1 * 6) + (d << 0 * 6);
if (j < out_len) out[j++] = (triple >> 2 * 8) & 0xFF;
if (j < out_len) out[j++] = (triple >> 1 * 8) & 0xFF;
if (j < out_len) out[j++] = (triple >> 0 * 8) & 0xFF;
}
return fixed_string<(((N-1) >> 2) * 3) + 1>(out);
}
Usage:
constexpr auto x = decode("aGVsbG8gd29ybGQ=");
/*...*/
printf(x.data()); // hello world
Demo: https://godbolt.org/z/HFdk6Z
updated to address helpful feedback from Marek R and Frank
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With