I am developing a web application which needs to send a lot of HTTP requests to GitHub. After a certain number of successful requests, I get HTTP 403: Forbidden with the message "API Rate Limit Exceeded".
Is there a way to increase the API rate limit, or to bypass it altogether for GitHub?
When using GITHUB_TOKEN (the token GitHub Actions provides automatically), the rate limit is 1,000 requests per hour per repository. For requests to resources that belong to an enterprise account on GitHub.com, GitHub Enterprise Cloud's rate limit applies: 15,000 requests per hour per repository.
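For reference, the API can tell you which limits apply to the token you are sending. A minimal sketch with the requests library, assuming a token in the GITHUB_TOKEN environment variable:

```python
import os

import requests

# Ask GitHub which rate limits apply to the token being used.
# Assumes a token in the GITHUB_TOKEN environment variable.
token = os.environ["GITHUB_TOKEN"]
resp = requests.get(
    "https://api.github.com/rate_limit",
    headers={"Authorization": f"token {token}"},
)
resp.raise_for_status()

core = resp.json()["resources"]["core"]
print(f"limit={core['limit']} remaining={core['remaining']} resets_at={core['reset']}")
```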
Authenticating your requests is only a partial solution, because the limit is still 5,000 API calls per hour, or roughly 80 calls per minute, which is really not that much.
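If you stay with the API, you at least want to authenticate every request and watch the rate-limit headers GitHub sends back, so you can back off before running into the 403. A minimal sketch, assuming a personal access token in GITHUB_TOKEN and a placeholder organization name:

```python
import os
import time

import requests

TOKEN = os.environ["GITHUB_TOKEN"]  # assumed: a personal access token
session = requests.Session()
session.headers["Authorization"] = f"token {TOKEN}"

def github_get(url, **params):
    """GET a GitHub API URL, sleeping until the limit resets if we hit it."""
    resp = session.get(url, params=params)
    if resp.status_code == 403 and resp.headers.get("X-RateLimit-Remaining") == "0":
        reset = int(resp.headers["X-RateLimit-Reset"])  # epoch seconds
        time.sleep(max(reset - time.time(), 0) + 1)
        resp = session.get(url, params=params)
    resp.raise_for_status()
    return resp.json()

# "my-org" is a placeholder organization name.
repos = github_get("https://api.github.com/orgs/my-org/repos", per_page=100)
print(len(repos), "repositories on the first page")
```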
I am writing a tool to compare over 350 repositories in an organization and to find correlations between them. The tool uses Python for Git/GitHub access, but I don't think that is the relevant point here.
After some initial success, I found that the capabilities of the GitHub API are too limited, both in the number of calls and in bandwidth, if you really want to ask the repositories a lot of deep questions.
Therefore, I switched to a different approach:
Instead of doing everything through the GitHub API, I wrote a GitHub mirror script that mirrors all of those repositories in less than 15 minutes, using a parallel Python script built on pygit2.
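The mirror script itself is not reproduced here, but the idea can be sketched with pygit2 and a thread pool; the repository URLs and the target directory below are placeholders, and private repositories would additionally need credential callbacks:

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import pygit2

# Placeholder clone URLs; in practice you would fetch the full list from the
# GitHub API once (one call per page of repositories).
REPO_URLS = [
    "https://github.com/my-org/repo-a.git",
    "https://github.com/my-org/repo-b.git",
]
MIRROR_DIR = Path("mirrors")

def mirror(url: str) -> str:
    """Clone one repository as a bare copy suitable for local analysis."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    target = MIRROR_DIR / name
    if not target.exists():
        pygit2.clone_repository(url, str(target), bare=True)
    return name

MIRROR_DIR.mkdir(exist_ok=True)
with ThreadPoolExecutor(max_workers=8) as pool:
    for name in pool.map(mirror, REPO_URLS):
        print("mirrored", name)
```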
Then I wrote everything possible against the local repositories with pygit2. This solution became faster by a factor of 100 or more, because there was neither an API nor a bandwidth bottleneck.
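To illustrate what "everything possible, locally" can look like, here is a sketch that walks the commit history of one mirrored repository with pygit2 and counts commits per author; the repository path is a placeholder:

```python
from collections import Counter

import pygit2

# Placeholder path to one of the bare mirrors created above.
repo = pygit2.Repository("mirrors/repo-a.git")

# Walk the history reachable from HEAD and count commits per author.
commits_per_author = Counter()
for commit in repo.walk(repo.head.target, pygit2.GIT_SORT_TOPOLOGICAL):
    commits_per_author[commit.author.name] += 1

for author, count in commits_per_author.most_common(10):
    print(f"{count:6d}  {author}")
```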
Of course, this cost extra effort, because the pygit2 API is quite different from github3.py, which I preferred for the GitHub API part of the solution.
And that is actually my conclusion/advice: The most efficient way to work with lots of Git data is:
clone all repos you are interested in, locally
write everything possible using pygit2, locally
write everything else, like public/private status, pull requests, wiki pages, issues, etc., using the github3.py API or whatever you prefer (see the sketch below).
This way you can maximize your throughput, and your limiting factor becomes the quality of your program (which is also non-trivial).
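For the parts that only live on GitHub itself (issues, pull requests, repository metadata and the like), a github3.py sketch could look like the following; the owner and repository names are placeholders and the token is again assumed to be in the environment:

```python
import os

import github3

# Placeholders: owner/repository names; token assumed in the environment.
gh = github3.login(token=os.environ["GITHUB_TOKEN"])
repo = gh.repository("my-org", "repo-a")

print(repo.full_name, "private" if repo.private else "public")

for pr in repo.pull_requests(state="open"):
    print("PR", pr.number, pr.title)

for issue in repo.issues(state="open"):
    print("issue", issue.number, issue.title)
```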