Getting all versions of a file using GitHub blob api

I want to know how to get all commits/versions of a file (i.e. the contents of the commits/versions) via the GitHub API.
I found one way to do it, which is equivalent to the answer to this other question.

The problem is that this uses the "contents" API, which has an upper limit of 1 MB per file. If you try to access a file larger than 1 MB, you get this error message: "This API returns blobs up to 1 MB in size. The requested blob is too large to fetch via the API, but you can use the Git Data API to request blobs up to 100 MB in size."

So to get files larger than 1 MB (up to 100 MB) you need to use the "blob" API, but I don't know how to use it in the same way as the contents API.

I.e., given a specific commit of a file, how do you get the contents of that file using the "blob" API?

buddyroo30 asked Dec 24 '15 17:12


1 Answer

The Get Contents API does allow you to pass a SHA1 (of a commit) through its ref parameter:

GET https://api.github.com/repos/:owner/:repo/contents/:FILE_PATH?ref=SHA
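A minimal Python sketch of that request, using only the standard library. The helper names (contents_url, decode_contents, fetch_json) are made up for illustration, and it assumes the response is the usual JSON object with a base64 "content" field and an "encoding" field:

```python
import base64
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def contents_url(owner, repo, file_path, ref):
    """Build the contents URL for file_path as of the commit, branch, or tag `ref`."""
    quoted_path = urllib.parse.quote(file_path)  # "/" between segments is kept
    query = urllib.parse.urlencode({"ref": ref})
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{quoted_path}?{query}"


def decode_contents(payload):
    """Return the file bytes from a contents response (assumes base64 encoding)."""
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # b64decode ignores the line breaks embedded in the encoded text
    return base64.b64decode(payload["content"])


def fetch_json(url):
    """GET a URL and parse the JSON body (unauthenticated, so rate limits apply)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would look like decode_contents(fetch_json(contents_url(owner, repo, path, commit_sha))), called once per commit SHA.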

Note: As of May 2022, the GitHub Contents API supports files up to 100 MB.

The Blob API also uses a SHA1:

GET /repos/:owner/:repo/git/blobs/:sha
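As a rough sketch, the blob response can be handled much like a contents response. The names blob_url and decode_blob are hypothetical; the code assumes the JSON body carries "content" and "encoding" fields, with "base64" being the expected encoding and "utf-8" handled defensively:

```python
import base64


def blob_url(owner, repo, blob_sha):
    """Build the Git Data API URL for a blob, given the blob's own SHA1."""
    return f"https://api.github.com/repos/{owner}/{repo}/git/blobs/{blob_sha}"


def decode_blob(payload):
    """Return the raw file bytes from a parsed blob response."""
    encoding = payload.get("encoding")
    if encoding == "base64":
        return base64.b64decode(payload["content"])
    if encoding == "utf-8":
        return payload["content"].encode("utf-8")
    raise ValueError(f"unexpected encoding: {encoding!r}")
```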

But here the SHA1 is that of the blob (the file contents), not of the commit, so you need to get the SHA1 of the file you want first.

See "How do I get the “sha” parameter from GitHub API without downloading the whole file?", using the Get Tree API for the parent folder.

GET /repos/<owner>/<repo>/git/trees/url_encode(<branch_name>:<parent_path>)

Here, 'url_encode(<branch_name>:<parent_path>)' means that the string <branch_name>:<parent_path> must be URL-encoded.

The result from the tree will give you the SHA1 of the file you are looking for.
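The two steps above (URL-encoding the tree-ish, then picking the file out of the tree listing) could be sketched in Python as follows. The function names are illustrative only, and the code assumes the tree response has a "tree" list whose entries carry "path", "type", and "sha":

```python
import urllib.parse


def tree_url(owner, repo, ref, parent_path):
    """Build the Get Tree URL for the folder parent_path as of branch or commit ref."""
    # safe="" forces ":" and "/" to be percent-encoded as well
    tree_ish = urllib.parse.quote(f"{ref}:{parent_path}", safe="")
    return f"https://api.github.com/repos/{owner}/{repo}/git/trees/{tree_ish}"


def find_blob_sha(tree_payload, file_name):
    """Return the blob SHA1 of file_name in a parsed tree response, or None."""
    for entry in tree_payload.get("tree", []):
        if entry.get("type") == "blob" and entry.get("path") == file_name:
            return entry["sha"]
    return None
```

In a non-recursive tree listing, each entry's "path" is relative to the requested folder, so file_name is just the bare file name.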

The OP buddyroo30 mentions in the comments:

I ended up doing similarly using the tree API.
Specifically, I get all the commits for a file. Then I try to use the contents API to get the file contents for each commit.
If that fails (i.e. the file is over 1 MB in size, so I need to use the blob API), I get the tree URL for the file from its commit (i.e. in Perl: $commit_tree_url = $commit_info->{'commit'}->{'tree'}->{'url'}).
Then I fetch $commit_tree_url and find the correct tree record in the results for the file; this will have a 'url' hash value which can be used to get the file contents via the blob API.
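A Python approximation of that workflow, covering only the blob fallback path, might look like the sketch below. It is not the OP's Perl code: the function names are invented, get_json is a caller-supplied function that fetches a URL and returns parsed JSON, and walking the tree one path segment at a time is an assumption added here so that files in subfolders are found too.

```python
import base64
import urllib.parse

API_ROOT = "https://api.github.com"


def blob_url_for_commit(commit_info, file_path, get_json):
    """Walk from the commit's root tree down to file_path; return its blob URL or None."""
    url = commit_info["commit"]["tree"]["url"]
    entry = None
    for segment in file_path.split("/"):
        entries = get_json(url).get("tree", [])
        entry = next((e for e in entries if e.get("path") == segment), None)
        if entry is None or "url" not in entry:
            return None  # e.g. the file was deleted in this commit
        url = entry["url"]
    return url if entry.get("type") == "blob" else None


def file_versions(owner, repo, file_path, get_json):
    """Yield (commit_sha, file_bytes) for each listed commit that contains file_path."""
    query = urllib.parse.urlencode({"path": file_path})
    commits_url = f"{API_ROOT}/repos/{owner}/{repo}/commits?{query}"
    for commit_info in get_json(commits_url):  # first page of results only
        blob_url = blob_url_for_commit(commit_info, file_path, get_json)
        if blob_url is None:
            continue
        blob = get_json(blob_url)
        yield commit_info["sha"], base64.b64decode(blob["content"])
```

The commit listing is paginated, so a complete history would need to follow the remaining pages; this sketch reads only the first.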

VonC answered Oct 15 '22 04:10