I am using the GitHub API to download a file from GitHub. I have been able to successfully authenticate as well as get a response from github, and see a base64 encoded string representing the file contents.
Unfortunately, I get an unusual error (string length is not a multiple of 4) when decoding the base64 string.
The HTTP request is illustrated below:
GET /repos/:owner/:repo/contents/:path
The (partial) response is illustrated below:
{
    "name":....,
    "download_url":...",
    "type":"file",
    "content":"ewogICAgInN3YWdnZXIiOiAiM...
}
The issue I am encountering is that the length of the string is 15263 bytes, and I get an error in decoding the string (string length is not a multiple of 4). I am using node.js and the 'base64-js' npm module to decode the string. Code to execute the decoding is illustrated below:
var base64 = require('base64-js');
var contents = base64.toByteArray(fileContent);
The decoding causes an exception:
Error: Invalid string. Length must be a multiple of 4
    at placeHoldersCount (.../node_modules/base64-js/index.js:23:11)
    at Object.toByteArray (...node_modules/base64-js/index.js:42:18)
    :
    :
I would think that the GitHub API is sending me the correct data, so I figure that is not the issue.
Am I performing the decoding improperly or is there another problem I am overlooking?
Any help is appreciated.
I experimented a bit and found a solution by using a different base64 decoding library as follows:
var base64 = require('js-base64').Base64;
var contents = base64.decode(res.content);
I am not sure if it is mandatory to have an encoded string length divisible by 4 (clearly my 15263 character length string is not divisible by 4) but the alternate library decoded the string properly.
A second solution which I also found to work is specific to how to use the GitHub API. By adding the following to the GitHub API call header, I was also able to get the decoded file contents:
'accept': 'application/vnd.github.VERSION.raw'
After much experimenting, I think I nailed down the difference between the working and broken base64 decoding.
It appears GitHub Base-64 encodes with:
As opposed to a "basic" (RFC4648) Base64 encoder. Several languages seem to default to the basic encoder (including Java, which I was using). When I switched to a MIME encoder, I got the full contents of the file un-garbled. This would explain why switching libraries in some cases fixed the issue.
I will note the contents field contained newline characters - decoders are supposed to ignore them, but not all do, so if you still get errors, you may need to try removing them.
The media-type header will do the job better, however in my case I am trying to use the API via a GitHub App - at time of writing, GitHub requires a specific media type be used when doing that, and it returns the JSON response.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With