
How to decode BASE64 in Standard SQL?

I'm trying to decode a column of Base64 values into strings with Standard SQL in BigQuery, without any luck so far.

I found a function called FROM_BASE64(), but:

A. The documentation makes it appear as though it converts Base64 into BYTES, which means I'd need an additional conversion to get a STRING.

B. FROM_BASE64() doesn't even seem to work: the query runs, but the result is always the exact same encoded string that I passed in.

SELECT FROM_BASE64('aGVsbG8tc3RhY2tvdmVyZmxvdw==')

returns

"aGVsbG8tc3RhY2tvdmVyZmxvdw=="

What would be the best approach here?

asked Oct 25 '18 16:10 by Brian



1 Answer

Base64 is a byte-level encoding, which is why FROM_BASE64() returns BYTES.

As long as you don't need to display the decoded bytes, you can work with them in your queries as needed. However, raw bytes might not be displayable as-is in BigQuery, so it might choose to display them as Base64 instead. In that case, your example would simply decode the string literal to bytes and then re-encode the result back to Base64 for display, which would explain why you see your input string as the output.

You can cast BYTES to STRING, but that only works if the raw bytes represent a valid UTF-8 encoded string. Or, you can use SAFE_CONVERT_BYTES_TO_STRING() to decode BYTES to a STRING, replacing any invalid UTF-8 bytes with the Unicode code point U+FFFD.
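
Applied to the literal from the question, the two conversions might look like the sketch below. The column aliases are made up for illustration, and the result noted in the comment assumes the decoded bytes are valid UTF-8, which is the case for this particular input.

```sql
-- Decode Base64 to BYTES, then convert the BYTES to a STRING.
SELECT
  -- CAST raises an error if the decoded bytes are not valid UTF-8.
  CAST(FROM_BASE64('aGVsbG8tc3RhY2tvdmVyZmxvdw==') AS STRING) AS decoded_strict,
  -- SAFE_CONVERT_BYTES_TO_STRING replaces invalid UTF-8 bytes with U+FFFD
  -- instead of failing the query.
  SAFE_CONVERT_BYTES_TO_STRING(
    FROM_BASE64('aGVsbG8tc3RhY2tvdmVyZmxvdw==')
  ) AS decoded_lenient;
-- Both columns should contain: hello-stackoverflow
```

For a column of arbitrary data, the lenient variant is probably the safer default, since a single row with invalid UTF-8 would otherwise fail the whole query.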

Alternatively, you can use FORMAT() to display each byte of a BYTES value in hex \x## format.
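
A rough sketch of inspecting the decoded bytes as text follows. The `%t` specifier is my assumption about which FORMAT() specifier is meant here; as far as I know it prints printable bytes as-is and escapes only the rest as \x##. TO_HEX() is not mentioned above and is shown only as a swapped-in alternative that yields two hex digits for every byte. The input literal is arbitrary sample data, not taken from the question.

```sql
-- 'AAH/' is the Base64 encoding of the three bytes 0x00 0x01 0xFF,
-- which do not form a valid UTF-8 string.
SELECT
  -- Escaped rendering of the bytes (assumed specifier: %t).
  FORMAT('%t', FROM_BASE64('AAH/')) AS escaped_bytes,
  -- Plain hex rendering, two characters per byte.
  TO_HEX(FROM_BASE64('AAH/')) AS hex_bytes;
```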

answered Oct 11 '22 01:10 by Remy Lebeau