When fetching data using an api, is it best to store that data on another database, or is it best to keep fetching that data whenever you need it? [duplicate]

I'm using the TMDB API to fetch information such as film titles and release years, but I'm wondering whether I should create a database to store this information locally rather than calling the API every time I need it. For example, should I create a film model and call:

film.title

which reads the title from a local database, or should I call:

Tmdb::Movie.detail(550).title

which makes another call to the API?

Adam Browner asked Sep 09 '16 11:09


3 Answers

Having dealt with a large Rails application that made service calls to about a dozen other applications, I'd say caching is your best bet. The problem with the database solution is keeping it up to date. The problem with making the calls every time is that it's too slow. There is a middle ground. For this you want Ruby on Rails Low Level Caching:

Sometimes you need to cache a particular value or query result instead of caching view fragments. Rails' caching mechanism works great for storing any kind of information.

The most efficient way to implement low-level caching is using the Rails.cache.fetch method. This method does both reading and writing to the cache. When passed only a single argument, the key is fetched and value from the cache is returned. If a block is passed, the result of the block will be cached to the given key and the result is returned.

An example that is pertinent to your use case:

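class TmdbService
  # Returns the TMDB details for a movie, served from the Rails cache when possible.
  def self.movie_details(id)
    # On a cache miss the block runs and its result is cached for 4 hours.
    Rails.cache.fetch("tmdb.movie.details.#{id}", expires_in: 4.hours) do
      Tmdb::Movie.detail(id)
    end
  end
end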

You can then configure your Rails application to use memcached or the database for the cache; it doesn't matter which. The point is that you want this cached data to expire at some point, so that you keep getting up-to-date information.
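
To make the fetch-with-expiry behaviour concrete, here is a minimal plain-Ruby sketch of the idea. `TinyCache` is a hypothetical stand-in written for illustration, not how Rails implements its cache stores, and the block body (with its placeholder title) stands in for the real `Tmdb::Movie.detail` call.

```ruby
# Hypothetical in-memory cache illustrating fetch-with-expiry semantics.
# This is an illustration only, not Rails' implementation.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @entries = {}
    @clock = clock
  end

  # Returns the cached value for key if it has not expired. Otherwise runs
  # the block, stores its result for expires_in seconds, and returns it.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

# Usage: count how often the (pretend) API is actually hit.
api_calls = 0
cache = TinyCache.new
2.times do
  cache.fetch("tmdb.movie.details.550", expires_in: 4 * 60 * 60) do
    api_calls += 1                      # stands in for the slow remote call
    { "title" => "Example Title" }      # placeholder data
  end
end
# api_calls is 1 here: the second fetch was served from the cache.
```

The second lookup never reaches the block, which is the whole point: the remote call is only paid for on a miss or after the entry expires.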

Greg Burghardt answered Sep 30 '22 15:09


This is a big decision to make. If the amount of data you get through the API is not huge, you can store all of it in your database. That way you get the data much faster, and your application keeps working even when the API is down.

If the amount of data is huge and you don't have the resources to store all of it, you should at least store the most important data in your database as a cache.

If you do not store any data on your own, you depend on the data source, and it can have downtime.

The problem with storing data on your side comes when the data changes and you need to synchronize. Even then, it is still worth storing the data on your side as a cache, to get results faster, and synchronizing it periodically.
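
As a rough sketch of "store it locally, but survive API downtime": the class below keeps a local copy with a timestamp, refreshes it when it is older than `max_age`, and falls back to the stale copy if the refresh fails. `LocalFilmStore`, the `fetcher` callable and the in-memory Hash are all assumptions made for the example; in a real app the Hash would be a database table and the fetcher would wrap the TMDB client.

```ruby
# Hypothetical local store; a Hash stands in for a database table.
class LocalFilmStore
  Record = Struct.new(:data, :fetched_at)

  # fetcher: callable taking an id and returning data from the remote API
  # max_age: seconds before a local copy counts as stale
  def initialize(fetcher:, max_age:)
    @fetcher = fetcher
    @max_age = max_age
    @records = {}
  end

  def find(id, now: Time.now)
    record = @records[id]
    # Fresh local copy: no remote call at all.
    return record.data if record && now - record.fetched_at < @max_age

    begin
      data = @fetcher.call(id)
      @records[id] = Record.new(data, now)
      data
    rescue StandardError
      # Remote call failed: re-raise if nothing is stored locally,
      # otherwise serve the stale copy.
      raise unless record
      record.data
    end
  end
end

# Usage with a stand-in fetcher (placeholder data, no real API involved).
store = LocalFilmStore.new(
  fetcher: ->(id) { { "id" => id, "title" => "Example Title" } },
  max_age: 24 * 60 * 60
)
film = store.find(550)
```

Whether serving a stale copy is acceptable depends on the data; for film titles and release years it probably is.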

jan.zikan answered Sep 30 '22 16:09


Calls to a local database are much faster than calls to external APIs. I would expect a local database to return within a few milliseconds, whereas an API call will probably take hundreds of milliseconds. Local calls are also less likely to be affected by network issues or downtime.

Therefore I would always cache the result of an API call in a local database and occasionally update the local copy with a newer version from the API.

But in the end it depends on your requirements: Do you need real-time data, or is a cached version okay? How often do you need that data, and how often is it updated? How fast is the API, and is latency an issue? Does the API have a rate limit (a maximum number of requests per time period)?
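
One way to turn those questions into code is a periodic refresh that only touches stale records and never spends more than a fixed number of API requests per run. The sketch below is only the selection step; `ids_to_refresh`, the record shape and the `budget` parameter are assumptions for illustration, and the numbers are made up.

```ruby
# Picks which locally stored records to refresh in one run.
# records: array of hashes with :id and :fetched_at (a Time)
# max_age: seconds after which a record counts as stale
# budget:  maximum number of API requests this run may spend
def ids_to_refresh(records, max_age:, budget:, now: Time.now)
  records
    .select { |r| now - r[:fetched_at] > max_age }  # stale records only
    .sort_by { |r| r[:fetched_at] }                 # oldest first
    .first(budget)                                  # stay within the request budget
    .map { |r| r[:id] }
end

now = Time.at(1_000_000)
records = [
  { id: 1, fetched_at: now - 10 },      # fresh
  { id: 2, fetched_at: now - 7_200 },   # stale
  { id: 3, fetched_at: now - 90_000 },  # stalest
  { id: 4, fetched_at: now - 3_601 }    # just past max_age
]

stale = ids_to_refresh(records, max_age: 3_600, budget: 2, now: now)
# stale is [3, 2]: the two oldest stale records; id 4 waits for the next run.
```

A tight rate limit means a small `budget`, and how fresh the data must be sets `max_age`.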

spickermann answered Sep 30 '22 17:09