
Caching GitHub API calls

I have a general question about caching API calls, in this instance calls to the GitHub API.

Let's say I have a page in my app that shows the filenames of a repo, and the content of the README. This means that I will have to do a few API calls in order to retrieve that.

Now, let's say I want to add something like memcached in between, so I'm not doing these calls over and over, if I don't need to.

How would you normally go about this? If I don't enable a webhook on GitHub, I have no way of knowing when the cache should expire. I could always make a single call to get the current SHA of HEAD and, if it hasn't changed, use the cache instead. But that works at the repo level, not at the file level.

I imagine I could do something similar with the object SHAs, but if I need to call the API anyway to get those, it defeats the purpose of caching.

How would you go about it? I know a service like prose.io has no caching right now, but if it should, what would the approach be?

Thanks

Ronze asked Feb 15 '13 00:02




1 Answer

Would just using HTTP caching be good enough for your use case? The purpose of HTTP caching is not only to avoid making a request when you already have a fresh response. It also lets you quickly validate whether the response you already have in the cache is still valid, without the server sending the complete response again if nothing has changed.

Looking at GitHub API responses, I can see that GitHub correctly sets the relevant HTTP headers (ETag, Last-Modified, Cache-Control).

So, you just do a GET, e.g. for:

GET https://api.github.com/users/izuzak/repos

and this returns:

200 OK
...
ETag: "df739f00c5053d12ef3c625ad6b0fd08"
Last-Modified: Thu, 14 Feb 2013 22:31:14 GMT
...

Next time, you do a GET for the same resource but also supply the relevant HTTP caching headers, which makes it a conditional GET:

GET https://api.github.com/users/izuzak/repos
...
If-Modified-Since: Thu, 14 Feb 2013 22:31:14 GMT
If-None-Match: "df739f00c5053d12ef3c625ad6b0fd08"
...

And lo and behold, the server returns a 304 Not Modified response with no body, and your HTTP client pulls the response from its cache:

304 Not Modified

So the GitHub API does HTTP caching right, and you should use it. Granted, you also need an HTTP client that supports HTTP caching. Best of all, a 304 Not Modified response does not count against your remaining API rate limit. See: https://docs.github.com/en/rest/overview/resources-in-the-rest-api#conditional-requests
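
If your HTTP client does not do this for you, the conditional GET exchange described above can be sketched by hand. This is only a minimal illustration using Python's standard library: the `ConditionalCache` class and the `http_get` helper are hypothetical names invented here, the in-memory dict stands in for something like memcached, and error handling is deliberately thin.

```python
import urllib.error
import urllib.request


def http_get(url, headers):
    """Perform a GET and return (status, headers_dict, body_bytes)."""
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, dict(response.headers.items()), response.read()
    except urllib.error.HTTPError as err:
        # urllib raises for every non-2xx status, including 304 Not Modified.
        return err.code, dict(err.headers.items()), err.read()


class ConditionalCache:
    """Hypothetical helper: remembers validators and bodies per URL."""

    def __init__(self, fetch=http_get):
        self._fetch = fetch
        self._entries = {}  # url -> (etag, last_modified, body)

    def get(self, url):
        request_headers = {}
        entry = self._entries.get(url)
        if entry is not None:
            etag, last_modified, _ = entry
            # Send back the validators from the previous 200 response.
            if etag:
                request_headers["If-None-Match"] = etag
            if last_modified:
                request_headers["If-Modified-Since"] = last_modified

        status, response_headers, body = self._fetch(url, request_headers)
        # Header names are case-insensitive, so normalize before lookup.
        normalized = {k.lower(): v for k, v in response_headers.items()}

        if status == 304 and entry is not None:
            return entry[2]  # unchanged: reuse the cached body
        if status == 200:
            self._entries[url] = (
                normalized.get("etag"),
                normalized.get("last-modified"),
                body,
            )
            return body
        raise RuntimeError(f"unexpected status {status} for {url}")
```

Calling `ConditionalCache().get("https://api.github.com/users/izuzak/repos")` twice would send the second request with `If-None-Match` and `If-Modified-Since` set, and return the stored body if the server answers 304.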

The GitHub API also sets the Cache-Control: private, max-age=60 header, so each response stays fresh for 60 seconds. A caching HTTP client will answer requests for the same resource made less than 60 seconds apart straight from its cache, without contacting the server at all.
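
As a rough sketch of that freshness check, the following reads `max-age` from a stored `Cache-Control` value and compares it with the entry's age. The function names are made up for illustration, and this simplified version ignores other inputs that a full HTTP cache takes into account, such as the `Age` and `Expires` headers.

```python
import re
import time


def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(stored_at, cache_control, now=None):
    """True if an entry stored at `stored_at` (epoch seconds) is still fresh."""
    max_age = max_age_seconds(cache_control)
    if max_age is None:
        return False  # no explicit lifetime: revalidate with the server
    now = time.time() if now is None else now
    return (now - stored_at) < max_age
```

While `is_fresh(...)` is true, the cached body can be returned without any request; once it turns false, a conditional GET is the next step.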

Your idea of making a single conditional GET request to a resource that is certain to change whenever anything in the repo changes (a resource showing the SHA of HEAD, for example) sounds reasonable. If that resource hasn't changed, you don't have to check the individual files, since they surely haven't changed either.
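
One way to picture this repo-level gate is the sketch below. Everything in it is an assumption for illustration: `load_repo_view`, `get_head_sha` and `build_view` are hypothetical names, the two callables stand for whatever API calls you make, and a plain dict plays the role of memcached.

```python
def load_repo_view(repo, get_head_sha, build_view, cache):
    """Return the cached view of `repo`, rebuilding it only if HEAD moved.

    get_head_sha(repo) -> str      one cheap (ideally conditional) request
    build_view(repo, sha) -> obj   the expensive part: file list plus README
    cache                          dict-like store keyed by repo name
    """
    head_sha = get_head_sha(repo)
    cached = cache.get(repo)
    if cached is not None and cached["head_sha"] == head_sha:
        return cached["view"]  # nothing in the repo changed

    view = build_view(repo, head_sha)
    cache[repo] = {"head_sha": head_sha, "view": view}
    return view
```

The trade-off is that any commit invalidates the whole view, even if the files you display were not touched; per-file conditional GETs would be finer-grained at the cost of more requests.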

Ivan Zuzak answered Sep 20 '22 18:09