I wonder whether it's a good idea to let local applications (on the same server) communicate with each other entirely through a RESTful API.
I know this is not uncommon, because we already have applications like CouchDB that use HTTP REST for communication, even with local applications.
But I want to take it to a higher level by creating applications that act like modules for a bigger application, which could itself be a module for another application, and so on. In other words, there will be a lot of local applications/modules communicating through RESTful APIs.
In this way these applications/modules could be in any language and they could communicate over the wire between servers.
But I have some questions:
Sure, perhaps.
Yup! But compared to what? Compared to native, internal calls, absolutely -- it'll be glacial. Compared to some other network API, eh, not necessarily slower.
Nah, no reason to allocate a port per module. All sorts of ways to do this.
The only way this will succeed is if the services you are talking about are coarse enough. These have to be big, black boxy kinds of services that make the expense of calling them worthwhile. You will be incurring connection costs, data transfer costs, and data marshaling cost on each transaction. So, you want those transactions to be as rare as possible, and you want the payloads to be as large as possible to get the best benefit.
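To make the point about coarseness concrete, here is a minimal sketch that counts round trips rather than making real HTTP calls. `CountingTransport`, `get_price`, and `get_prices` are hypothetical names invented for illustration. They stand in for whatever client your modules would use.

```python
class CountingTransport:
    """Stand-in for an HTTP client: counts round trips instead of making them."""

    def __init__(self, prices):
        self._prices = prices
        self.round_trips = 0

    def get_price(self, sku):
        # Fine-grained call: one round trip per item.
        self.round_trips += 1
        return self._prices[sku]

    def get_prices(self, skus):
        # Coarse-grained call: one round trip for the whole batch.
        self.round_trips += 1
        return {sku: self._prices[sku] for sku in skus}


def total_chatty(transport, skus):
    """Pays connection and marshaling costs once per SKU."""
    return sum(transport.get_price(sku) for sku in skus)


def total_coarse(transport, skus):
    """Pays those costs once, with a larger payload."""
    prices = transport.get_prices(skus)
    return sum(prices[sku] for sku in skus)
```

Both functions return the same total. The difference is that the chatty version pays the per-transaction overhead once per item, while the coarse version pays it once per batch.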
Are you talking about actually using the REST architecture, or just sending stuff back and forth via HTTP? (These are different things.) REST incurs its own costs, including embedded linkages, ubiquitous and common data types, etc.
Finally, you simply may not need to do this. It might well be "kinda cool", a "nice to have", "looks good on the white board", but if you don't really need it, then don't do it. Simply follow good practices of isolating your internal services so that, should you decide later to do something like this, you can just insert the glue layer necessary to manage the communication, etc. Adding remote distribution will increase risk and complexity and lower performance (scaling != performance), so there should be a Good Reason to do it at all.
That, arguably, is the "best practice" of them all.
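One way to picture "isolate now, add the glue layer later" is the sketch below. The `InventoryService` boundary, the `/stock/<sku>` URL layout, and the `{"level": ...}` JSON shape are all assumptions made up for illustration, not anything from the question.

```python
import json
from typing import Protocol
from urllib.request import urlopen


class InventoryService(Protocol):
    """Hypothetical service boundary; callers depend only on this."""

    def stock_level(self, sku: str) -> int: ...


class LocalInventory:
    """In-process implementation: a plain method call, no network."""

    def __init__(self, stock: dict[str, int]) -> None:
        self._stock = stock

    def stock_level(self, sku: str) -> int:
        return self._stock.get(sku, 0)


class HttpInventory:
    """Glue layer you could add later if the service moves out of process.

    The URL layout and JSON shape here are assumptions.
    """

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")

    def stock_level(self, sku: str) -> int:
        with urlopen(f"{self._base_url}/stock/{sku}") as resp:
            return int(json.load(resp)["level"])


def can_ship(inventory: InventoryService, sku: str, quantity: int) -> bool:
    """Business logic that does not know or care where the service lives."""
    return inventory.stock_level(sku) >= quantity
```

`can_ship` works unchanged with either implementation, so the decision to distribute can be deferred until there is a real reason for it.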
Edit -- Response to comment:
So you mean I run ONE web server that handles all incoming requests? But then the modules won't be stand-alone applications, which defeats the whole purpose. I want each one of the modules to be able to run by itself.
No, it doesn't defeat the purpose.
Here's the deal.
Let's say you have 3 services.
http://store.example.com/service1
http://blog.example.com/service2
http://forum.example.com/service3
At a glance, it would be fair to say that these are three different services, on 3 different machines, running in 3 different web servers.
But the truth is that these can all be running on the SAME machine, on the SAME web server, even down to (to take this to the extreme) running the exact same logic.
HTTP allows you to map all sorts of things. HTTP itself is a mechanism of abstraction.
As a client, all you care about is the URL to use and the payload to send. What machine it ends up talking to, or what actual code it executes, is not the client's problem.
At an architecture level, you have achieved a manner of abstraction and modularization. The URLs let you organize your system in whatever LOGICAL layout you want. The PHYSICAL implementation is distinct from the logical view.
Those 3 services can be running on a single machine served by a single process. On the other hand, they can represent 1000 machines. How many machines do you think respond to "www.google.com"?
You can easily host all 3 services on a single machine, without sharing any code save the web server itself, which makes it easy to move a service from its original machine to some other machine.
The host name is the simplest way to map a service to a machine. Any modern web server can service any number of different hosts. Each "virtual host" can service any number of individual service endpoints within the name space of that host. At the "host" level it's trivial to relocate code from one machine to another if and when you have to.
You should explore more the capabilities of a modern web server to direct arbitrary requests to actual logic on the server. You'll find them very flexible.
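As a rough illustration of the virtual host idea, here is a sketch of a single WSGI dispatcher that routes on the `Host` header. The host names come from the example above. `make_service`, `ROUTES`, and `dispatcher` are hypothetical names, and a real web server would normally do this mapping in its own configuration rather than in application code.

```python
def make_service(name):
    """Build a tiny WSGI app that just identifies itself (illustrative only)."""
    def app(environ, start_response):
        body = name.encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return app


# Logical layout: host name -> service. The physical layout is one process.
ROUTES = {
    "store.example.com": make_service("store"),
    "blog.example.com": make_service("blog"),
    "forum.example.com": make_service("forum"),
}


def dispatcher(environ, start_response):
    """Pick the service from the Host header, ignoring any port suffix."""
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    app = ROUTES.get(host)
    if app is None:
        body = b"unknown host"
        start_response("404 Not Found", [("Content-Type", "text/plain"),
                                         ("Content-Length", str(len(body)))])
        return [body]
    return app(environ, start_response)
```

Moving one service to another machine then only changes where that host name resolves. The client's URLs stay the same.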
Is this a good idea?
Yes. It's done all the time. That's how all database servers work, for example. Linux is packed full of client/server applications communicating through TCP/IP.
Will the data transfer between them be slow?
No. Connections to localhost go over the loopback interface, a shortcut that avoids actual network I/O.
The HTTP protocol isn't the best thing for dedicated connections, but it's simple and well supported.
If I do this, then each application/module has to be an HTTP server, right?
Not necessarily. Some modules can be clients and not have a server.
So if my application uses 100 applications/modules, then each one of these has to be a local HTTP web server, each running on a different port (http://localhost:81, http://localhost:82, http://localhost:83 and so on), right?
Yes. That's the way it works.
Any best practices/gotchas that I should know of?
Do not "hard-code" port numbers.
Do not use the "privileged" port numbers (under 1024).
Use a WSGI library and you'll be happiest making all your modules into WSGI applications. You can then use a trivial 2-line HTTP server to wrap your module.
Read this. http://docs.python.org/library/wsgiref.html#examples
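The suggestions above could be combined roughly as follows, using only the standard library's `wsgiref`. The `MODULE_PORT` environment variable and the names `module_app`, `resolve_port`, and `build_server` are assumptions for this sketch. Use whatever configuration mechanism you already have.

```python
import os
from wsgiref.simple_server import make_server


def module_app(environ, start_response):
    """A hypothetical module exposed as a WSGI application."""
    body = b"hello from module\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def resolve_port(env=os.environ):
    """Read the port from MODULE_PORT (an assumed name) instead of hard-coding it.

    A value of 0 asks the operating system for any free port.
    """
    port = int(env.get("MODULE_PORT", "0"))
    if 0 < port < 1024:
        raise ValueError("refusing privileged port %d" % port)
    return port


def build_server():
    """Wrap the module in wsgiref's simple HTTP server, bound to loopback only."""
    return make_server("127.0.0.1", resolve_port(), module_app)
```

To run it, call `httpd = build_server()` and then `httpd.serve_forever()`. The port that was actually bound is available as `httpd.server_port`, which is useful when the operating system picked it.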