
How to retrieve all the contributors of a repo using the GitHub API

I'm trying to get all the contributors of a repo using this GitHub API.

If I'm not wrong, it also tells me that if a repo has more than 500 contributors, it only gives 500 of them and the rest are marked as anonymous.

For performance reasons, only the first 500 author email addresses in the repository will be linked to GitHub users.

This repo, the Linux kernel, has 5k+ contributors, so as per the API I should get at least 500 contributors.

When I do curl -I https://api.github.com/repos/torvalds/linux/contributors?per_page=100

I get only 3 pages (per_page = 100), so I get at most 300 contributors (look at the "Link" header).

Is there a way to get all the contributors of the repo (5000+)?

HTTP/1.1 200 OK
Server: GitHub.com
Date: Thu, 19 Nov 2015 18:00:54 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 100308
Status: 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 56
X-RateLimit-Reset: 1447958881
Cache-Control: public, max-age=60, s-maxage=60
Last-Modified: Thu, 19 Nov 2015 16:06:38 GMT
ETag: "a57e0f74fc68e1791da15d33fa044616"
Vary: Accept
X-GitHub-Media-Type: github.v3
Link: <https://api.github.com/repositories/2325298/contributors?per_page=100&page=2>; rel="next", <https://api.github.com/repositories/2325298/contributors?per_page=100&page=3>; rel="last"
X-XSS-Protection: 1; mode=block
X-Frame-Options: deny
Content-Security-Policy: default-src 'none'
Access-Control-Allow-Credentials: true
Access-Control-Expose-Headers: ETag, Link, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval
Access-Control-Allow-Origin: *
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
X-Content-Type-Options: nosniff
Vary: Accept-Encoding
X-Served-By: a30e6f9aa7cf5731b87dfb3b9992202d
X-GitHub-Request-Id: 67E881D2:146C9:24CF1BB3:564E0E55
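
To make the page count explicit, the rel="last" entry of the Link header above can be pulled out with a small shell helper. This is only a sketch: the function name last_page is made up for illustration, and it assumes the URLs in the header contain no commas.

```shell
# Hypothetical helper: print the page number of the rel="last" entry
# from a Link header value passed as the first argument.
last_page() {
  printf '%s\n' "$1" |
    tr ',' '\n' |                                  # one link entry per line
    grep 'rel="last"' |                            # keep only the "last" entry
    sed -E 's/.*[?&]page=([0-9]+).*/\1/'           # extract its page parameter
}

# Link header value copied from the response above.
link='<https://api.github.com/repositories/2325298/contributors?per_page=100&page=2>; rel="next", <https://api.github.com/repositories/2325298/contributors?per_page=100&page=3>; rel="last"'

last_page "$link"
```

With per_page=100, the printed page number times 100 is an upper bound on how many contributors the endpoint returns.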
simplyblue asked Nov 19 '15 18:11



1 Answer

Since the GitHub API doesn't seem to support this, another (much slower) approach would be to clone the repo and then run this command to get the names:

git log --all --format='%aN' | sort -u

To get results by email address (which should guard against contributor name config changes and will be more accurate):

git log --all --format='%aE' | sort -u

If you need this for arbitrary repos, you could write a simple script that takes the repository path, clones the repo, runs the command, and then deletes the downloaded repo.
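
A minimal sketch of such a script is shown below. The function name list_contributors is made up for illustration, and using a bare clone is my own assumption to skip the working-tree checkout, since only the history is needed. It lists author email addresses; swap '%aE' for '%aN' to get names instead.

```shell
#!/bin/sh
# Hypothetical helper: list_contributors REPO_URL_OR_PATH
# Clones the repository into a temporary directory, prints the unique
# author email addresses, then removes the clone again.
list_contributors() {
  repo=$1
  tmpdir=$(mktemp -d) || return 1

  # Assumption: a bare clone is enough because only the commit history is read.
  if git clone --quiet --bare "$repo" "$tmpdir/repo.git"; then
    git --git-dir="$tmpdir/repo.git" log --all --format='%aE' | sort -u
    status=$?
  else
    status=1
  fi

  # Delete the downloaded repo whether or not the clone succeeded.
  rm -rf "$tmpdir"
  return "$status"
}
```

For example, list_contributors https://github.com/torvalds/linux.git | wc -l would count the distinct author addresses, though for a repo of that size the clone is a large download and will take a while.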

In the meantime, you could contact GitHub in the hope that they prioritize expanding or fixing their API.

Jonathan.Brink answered Nov 15 '22 20:11