
How to cache JSON data instead of accessing the REST endpoint

http://api.bitcoincharts.com/v1/markets.json (example endpoint)

I am planning to pull data from several REST endpoints, such as the ones listed below. At times, access to some of them fails because of a connectivity error or because the service is unavailable. I am interested only in the last snapshot of the data. To work around this, I would like to store the latest snapshot in a data store (preferably NoSQL, say MongoDB or Redis) and change the application logic to always read from that store instead of the API endpoint. This would make the data predictable, and I intend to run cron scripts that pull data regularly from these REST endpoints and write it to the store.

http://api.foo.com/v1/foo.json
http://api.bar.com/v1/bar.json
http://api.baz.com/v1/baz.json
  1. Is there a better approach to resolve this issue?
  2. What storage would be appropriate for storing the JSON as-is and retrieving it for processing: MongoDB or Redis?
Rpj asked May 14 '14 13:05



2 Answers

You are using REST, so you can cache the HTTP requests and responses with a simple HTTP reverse proxy such as Apache HTTP Server, NGINX, or Varnish. Why bother with NoSQL for a simple cache?

Of course, MongoDB and Redis provide a lot more functionality, but do you really need it? See this other question: Caching JSON objects on server side

zenbeni answered Oct 13 '22 22:10



  1. When you fetch data from a REST endpoint for the first time, store it in the caching layer and return it to the service. On subsequent requests, check whether the data exists in the cache; if it is not present, make a request to the REST endpoint and fetch it.

    You need to set an expiry time when storing the data in the caching layer. This removes the need for the cron job: instead of fetching all the data at once, you fetch it only when it is required, checking at that point whether the cached copy has expired.

  2. I would prefer Redis, as it is one of the best fits for a caching layer. It is a "NoSQL" key-value data store, unlike MongoDB, which is a disk-based document store. Similar to Memcached, it can evict old data as you add new data. Redis is a fantastic choice if you want a highly scalable data store shared by multiple processes, multiple applications, or multiple servers. Unlike Memcached, Redis provides powerful aggregate types such as sorted sets and lists. It has a configurable persistence model, in which it saves in the background at a specified interval, and it can be run in a master-slave setup. All of our Redis deployments run in master-slave, with the slave set to save to disk about every minute.

    Purely as an inter-process communication mechanism, it is tough to beat. Its speed also makes it great as a caching layer.
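
The fetch-on-miss flow from point 1 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `TTLStore`, `get_json`, `http_fetch`, and the `fetch` callback are hypothetical names, and the in-memory store only stands in for a real cache. Its `get` and `setex` methods are modelled on the calls of the same name in the redis-py client, so a Redis connection could probably be swapped in with little change.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional, Tuple


class TTLStore:
    """In-memory stand-in (hypothetical) for a key-value cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, str]] = {}

    def setex(self, key: str, ttl_seconds: float, value: str) -> None:
        # Store the value together with its absolute expiry time.
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._data[key]
            return None
        return value


def http_fetch(url: str) -> str:
    """Fetch the raw response body; raises on connectivity errors."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def get_json(
    url: str,
    store: TTLStore,
    fetch: Callable[[str], str] = http_fetch,
    ttl_seconds: float = 60,
) -> Any:
    """Return parsed JSON for url, calling the endpoint only on a cache miss."""
    cached = store.get(url)
    if cached is not None:
        return json.loads(cached)
    body = fetch(url)
    parsed = json.loads(body)  # parse first so invalid JSON is never cached
    store.setex(url, ttl_seconds, body)
    return parsed
```

With this shape, a call such as `get_json("http://api.foo.com/v1/foo.json", store, ttl_seconds=300)` would contact the endpoint at most once every five minutes. Note that an expired entry combined with a failing endpoint still raises an error here; keeping the last good response under a second key without an expiry would be one way to cover the outage case described in the question.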

mohamedrias answered Oct 13 '22 21:10
