I don't understand SOA (Service-oriented Architecture) and databases. While I'm attracted by the SOA concept (encapsulating reusable business logic into services) I can't figure out how it's supposed to work if data tables encapsulated in a service are required by other services/systems---or is SOA suitable at all in this scenario?
To be more concrete, suppose I have two services:
CustomerService
: contains my Customers
database table and associated business logic.OrderService
: contains my Orders
table and logic.Now what if I need to JOIN
the Customers
and Orders
tables with an SQL statement? If the tables contain millions of entries, unacceptable performance would result if I have to send the data over the network using SOAP/XML. And how to perform the JOIN
?
Doing a little research, I have found some proposed solutions:
If you have any input on this, please let me know.
One of the defining principals of a "service" in this context is that it owns, absolutely, that data in the area it is responsible for, as well as operations on that data.
Copying data, through replication or any other mechanism, ditches that responsibility. Either you replicate the business rules, too, or you will eventually wind up in a situation where you wind up needing the other service updated to change your internal rules.
Using a single data service is just "don't do SOA"; if you have one single place that manages all data, you don't have independent services, you just have one service.
I would suggest, instead, the third option: use composition to put that data together, avoiding the database level JOIN operation entirely.
Instead of thinking about needing to join those two values together in the database, think about how to compose them together at the edges:
When you render an HTML page for a customer, you can supply HTML from multiple services and compose them next to each other visually: the customer details come from the customer service, and the order details from the order service.
Likewise an invoice email: compose data supplied from multiple services visually, without needing the in-database join.
This has two advantages: one, you do away with the need to join in the database, and even the need to have the data stored in the same type of database. Now each service can use whatever data store is most appropriate for their need.
Two, you can more easily change the outside of your application. If you have small, composable parts you can easily add rearrange the parts in new ways.
The guiding principle is that it is ok to cache immutable data This means that simple immutable data from the customer entity can exist in the order service and there's no need to go to the customer service every time you need the info. Breaking everything to isolated services and then always making these remote procedure calls ignores the fallacies of distributed computing. If you have extensive reporting needs you need to create an additional service. I call that Aggregated Reporting service, which, again gets read-only data for reporting purposes. You can see an article I wrote about that for InfoQ a few years ago
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With