Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strange Base64 encode/decode problem

Tags:

I'm using Grails 1.3.7. I have some code that uses the built-in base64Encode function and base64Decode function. It all works fine in simple test cases where I encode some binary data and then decode the resulting string and write it to a new file. In this case the files are identical.

But then I wrote a web service that took the base64 encoded data as a parameter in a POST call. Although the length of the base64 data is identical to the string I passed into the function, the contents of the base64 data are being modified. I spend DAYS debugging this and finally wrote a test controller that passed the data in base64 to post and also took the name of a local file with the correct base64 encoded data, as in:

data=AAA-base-64-data...&testFilename=/name/of/file/with/base64data

Within the test function I compared every byte in the incoming data parameter with the appropriate byte in the test file. I found that somehow every "+" character in the input data parameter had been replaced with a " " (space, ordinal ascii 32). Huh? What could have done that?

To be sure I was correct, I added a line that said:

data = data.replaceAll(' ', '+')

and sure enough the data decoded exactly right. I tried it with arbitrarily long binary files and it now works every time. But I can't figure out for the life of me what would be modifying the data parameter in the post to convert the ord(43) character to ord(32)? I know that the plus sign is one of the 2 somewhat platform dependent characters in the base64 spec, but given that I am doing the encoding and decoding on the same machine for now I am super puzzled what caused this. Sure I have a "fix" since I can make it work, but I am nervous about "fixes" that I don't understand.

The code is too big to post here, but I get the base64 encoding like so:

def inputFile = new File(inputFilename)
def rawData =  inputFile.getBytes()
def encoded = rawData.encodeBase64().toString()

I then write that encoded string out to new a file so I can use it for testing later. If I load that file back in as so I get the same rawData:

def encodedFile = new File(encodedFilename)
String encoded = encodedFile.getText()
byte[] rawData = encoded.decodeBase64()

So all that is good. Now assume I take the "encoded" variable and add it to a param to a POST function like so:

String queryString = "data=$encoded"
String url = "http://localhost:8080/some_web_service"

def results = urlPost(url, queryString)

def urlPost(String urlString, String queryString) {
    def url = new URL(urlString)
    def connection = url.openConnection()
    connection.setRequestMethod("POST")
    connection.doOutput = true

    def writer = new OutputStreamWriter(connection.outputStream)
    writer.write(queryString)
    writer.flush()
    writer.close()
    connection.connect()

    return (connection.responseCode == 200) ? connection.content.text : "error                         $connection.responseCode, $connection.responseMessage"
}

on the web service side, in the controller I get the parameter like so:

String data = params?.data
println "incoming data parameter has length of ${data.size()}" //confirm right size

//unless I run the following line, the data does not decode to the same source
data = data.replaceAll(' ', '+')

//as long as I replace spaces with plus, this decodes correctly, why?
byte[] bytedata = data.decodeBase64()

Sorry for the long rant, but I'd really love to understand why I had to do the "replace space with plus sign" to get this to decode correctly. Is there some problem with the plus sign being used in a request parameter?

like image 970
Rich Sadowsky Avatar asked Apr 11 '11 23:04

Rich Sadowsky


People also ask

What is the problem that Base64 encoding method design to solve?

Base64 encoding schemes are commonly used when there is a need to encode binary data that needs be stored and transferred over media that are designed to deal with textual data. This is to ensure that the data remains intact without modification during transport.

How do I decode a Base64 encoded file?

To decode a file with contents that are base64 encoded, you simply provide the path of the file with the --decode flag. As with encoding files, the output will be a very long string of the original file. You may want to output stdout directly to a file.

Why do Base64 strings end with ==?

Q Why does an = get appended at the end? A: As a short answer: The last character ( = sign) is added only as a complement (padding) in the final process of encoding a message with a special number of characters.

Is it possible to decode Base64?

Base64 is an encoding, the strings you've posted are encoded. You can DECODE the base64 values into bytes (so just a sequence of bits). And from there, you need to know what these bytes represent and what original encoding they were represented in, if you wish to convert them again to a legible format.


2 Answers

Whatever populates params expects the request to be a URL-encoded form (specifically, application/x-www-form-urlencoded, where "+" means space), but you didn't URL-encode it. I don't know what functions your language provides, but in pseudo code, queryString should be constructed from

concat(uri_escape("data"), "=", uri_escape(base64_encode(rawBytes)))

which simplifies to

concat("data=", uri_escape(base64_encode(rawBytes)))

The "+" characters will be replaced with "%2B".

like image 61
ikegami Avatar answered Oct 17 '22 07:10

ikegami


You have to use a special base64encode which is also url-safe. The problem is that standard base64encode includes +, / and = characters which are replaced by the percent-encoded version.

http://en.wikipedia.org/wiki/Base64#URL_applications

I'm using the following code in php:

    /**
     * Custom base64 encoding. Replace unsafe url chars
     *
     * @param string $val
     * @return string
     */
    static function base64_url_encode($val) {

        return strtr(base64_encode($val), '+/=', '-_,');

    }

    /**
     * Custom base64 decode. Replace custom url safe values with normal
     * base64 characters before decoding.
     *
     * @param string $val
     * @return string
     */
    static function base64_url_decode($val) {

        return base64_decode(strtr($val, '-_,', '+/='));

    }
like image 24
Polak Avatar answered Oct 17 '22 06:10

Polak