My company has this Huge Database that gets fed with (many) events from multiple sources, for monitoring and reporting purposes. So far, every new dashboard or graphic built from the data has been a new Rails app with extra tables in the Huge Database and full access to the database contents.
Lately, there has been an idea floating around of giving external clients (as in, not our company, but sister companies) access to our data, and it has been decided we should expose a read-only RESTful API for consulting it.
My question is: should we use that API for our own projects too? Is it overkill to go through a RESTful API, even for "local" projects, instead of accessing the database directly? I think it would pay off by unifying our team's access to the data - but is it worth the extra round-trip? And can a RESTful API keep up with 20 or so queries per second while exposing the results as JSON?
Thanks for any input!
I think there's a lot to be said for consistency. If you're providing an API for your clients, it seems to me that by using the same API yourself you'll understand it better when it comes to supporting it for your clients, you'll be testing it regularly (beyond your regression tests), and you'll be sending the message that it's good enough for you to use, so it should be fine for your clients.
By hiding everything behind the API, you're at liberty to change the database representation without having to change both the API interface code (which presents the data via the API) and the database access code in your in-house applications; you'd only change the former.
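As a small illustration of that decoupling, the API's JSON representation can stay stable across a schema change; the model, column, and field names below are assumptions, not from the original setup:

    # app/models/event.rb -- hypothetical names throughout. The API keeps
    # exposing "source" even after the underlying column was renamed.
    class Event < ActiveRecord::Base
      def as_json(options = {})
        # Suppose `src` was renamed to `source_system` in the database;
        # only this method changed, so API clients never noticed.
        { id: id, source: source_system, occurred_at: occurred_at }
      end
    end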
Finally, such performance questions can really only be addressed by trying it and measuring. Perhaps it's worth knocking together a prototype API system and studying it under load?
I would definitely go down the API route. It presents an easy-to-maintain interface to ALL the applications that will talk to your data, including validation and so on. Sure, you can enforce database integrity with column constraints and stored procedures, but why maintain that as well?
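To make the validation point concrete, here's a sketch of the kind of rules that can live once in the API's models instead of being duplicated in stored procedures; the Event model and its fields are assumptions:

    # app/models/event.rb -- validations enforced in one place, for every
    # client that writes through the API (hypothetical model and fields).
    class Event < ActiveRecord::Base
      validates :source,      presence: true
      validates :occurred_at, presence: true
      validates :severity,    inclusion: { in: %w[info warning error] }
    end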
Don't forget - you can also cache the API calls in the file system, in memory, or using memcached (or any other service). Where datasets have not changed (check with updated_at or ETags), you can simply return cached versions for tremendous speed improvements. Adding ETags to a recent application I developed took HTML load time from 1.6 seconds down to 60 ms.
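As a concrete illustration, here's a minimal sketch of conditional GETs with ETags in a Rails controller; the EventsController and Event model are assumptions:

    # app/controllers/events_controller.rb -- hypothetical read-only endpoint.
    class EventsController < ApplicationController
      def index
        events = Event.order("updated_at DESC").limit(100)

        # stale? sets the ETag/Last-Modified headers from the collection;
        # when the client's If-None-Match or If-Modified-Since still match,
        # Rails answers 304 Not Modified and the JSON is never rendered.
        if stale?(etag: events, last_modified: events.maximum(:updated_at))
          render json: events
        end
      end
    end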
Off topic: an idea I've been toying with is dynamically loading API versions depending on the request. Something like this would let you alter the API dramatically while maintaining backwards compatibility. Since the different versions would live in separate files, it would be simple to maintain them separately.
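One common way to get per-version code paths in Rails is to namespace the routes so each version's controllers live in their own files; the application and resource names below are assumptions:

    # config/routes.rb -- Rails 3-style routing. Each version is a separate
    # namespace (app/controllers/api/v1/..., api/v2/...), so old versions
    # can be frozen and maintained independently.
    MyApp::Application.routes.draw do
      namespace :api do
        namespace :v1 do
          resources :events, only: [:index, :show]
        end
        namespace :v2 do
          resources :events, only: [:index, :show]
        end
      end
    end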
Also, if you use the API internally, you should be able to reduce the amount of code you have to maintain, since you'll only be maintaining the API rather than the API plus your own internal methods for accessing the data.
I've been thinking about the same thing for a project I'm about to start: whether I should build my Rails app from the ground up as a client of the API or not. I agree with the advantages already mentioned in the other answers.
Criticism
One problem I originally saw with this approach was that it would make me lose all the amenities and flexibility that ActiveRecord provides: associations, named_scopes and all. But using the API through ActiveResource brings a lot of the good stuff back, and it seems you can also have named_scopes. I'm not sure about associations.
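For reference, consuming the API through ActiveResource looks a lot like ActiveRecord; the site URL and the Event resource here are assumptions:

    require "active_resource"

    # Hypothetical client-side model backed by the REST API instead of the DB.
    class Event < ActiveResource::Base
      self.site   = "https://api.example.com"
      self.format = :json
    end

    # Reads mirror the ActiveRecord style:
    event  = Event.find(42)                                  # GET /events/42.json
    events = Event.find(:all, params: { source: "billing" }) # GET /events.json?source=billing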
More Criticism, please
We've all been singing the glories of this approach but, even though an answer has already been picked, I'd like to hear from other people what possible problems this approach might bring, and why we shouldn't use it.