How does data denormalization work with the Microservice Pattern?

I just read an article on Microservices and PaaS Architecture. In that article, about a third of the way down, the author states (under Denormalize like Crazy):

Refactor database schemas, and de-normalize everything, to allow complete separation and partitioning of data. That is, do not use underlying tables that serve multiple microservices. There should be no sharing of underlying tables that span multiple microservices, and no sharing of data. Instead, if several services need access to the same data, it should be shared via a service API (such as a published REST or a message service interface).

While this sounds great in theory, in practice it has some serious hurdles to overcome. The biggest is that databases are often tightly coupled, with every table having a foreign key relationship to at least one other table. Because of this, it could be impossible to partition a database into n sub-databases controlled by n microservices.

So I ask: Given a database that consists entirely of related tables, how does one denormalize this into smaller fragments (groups of tables) so that the fragments can be controlled by separate microservices?

For instance, given the following (rather small, but illustrative) database:

[users] table
=============
user_id
user_first_name
user_last_name
user_email

[products] table
================
product_id
product_name
product_description
product_unit_price

[orders] table
==============
order_id
order_datetime
user_id

[products_x_orders] table (for line items in the order)
=======================================================
products_x_orders_id
product_id
order_id
quantity_ordered

Don't spend too much time critiquing my design; I did this on the fly. The point is that, to me, it makes logical sense to split this database into 3 microservices:

  1. UserService - for CRUDding users in the system; should ultimately manage the [users] table; and
  2. ProductService - for CRUDding products in the system; should ultimately manage the [products] table; and
  3. OrderService - for CRUDding orders in the system; should ultimately manage the [orders] and [products_x_orders] tables
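
For concreteness, here's a minimal sketch of those three boundaries as Java interfaces; the entity types and method names are placeholders of my own, not from the article:

import java.util.List;

// Placeholder entity types mirroring the example schema.
record User(Long userId, String firstName, String lastName, String email) {}
record Product(Long productId, String name, String description, double unitPrice) {}
record LineItem(Long productId, int quantityOrdered) {}
record Order(Long orderId, Long userId, List<LineItem> lineItems) {}

// One interface per proposed microservice; each owns its own table(s).
interface UserService {
    User getUser(Long userId);
    Long createUser(User user);
}

interface ProductService {
    Product getProduct(Long productId);
    Long createProduct(Product product);
}

interface OrderService {
    Order getOrder(Long orderId);
    Long createOrder(Order order); // also owns the [products_x_orders] line items
}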

However, all of these tables have foreign key relationships with each other. If we denormalize the tables so that each fragment stands alone, stripping out the foreign keys, they lose their semantic meaning:

[users] table
=============
user_id
user_first_name
user_last_name
user_email

[products] table
================
product_id
product_name
product_description
product_unit_price

[orders] table
==============
order_id
order_datetime

[products_x_orders] table (for line items in the order)
=======================================================
products_x_orders_id
quantity_ordered

Now there's no way to know who ordered what, in which quantity, or when.

So is this article typical academic hullabaloo, or is there real-world practicality to this denormalization approach? And if so, what does it look like (bonus points for using my example in the answer)?

asked Nov 19 '14 by smeeb



1 Answer

This is subjective, but the following solution worked for me, my team, and our DB team.

  • At the application layer, microservices are decomposed by semantic function.
    • e.g. a Contact service might CRUD contacts (metadata about contacts: names, phone numbers, contact info, etc.)
    • e.g. a User service might CRUD users with login credentials, authorization roles, etc.
    • e.g. a Payment service might CRUD payments and work under the hood with a 3rd-party PCI-compliant service like Stripe, etc.
  • At the DB layer, the tables can be organized however the devs/DBAs/devops people want them organized

The problem is with cascading and service boundaries: Payments might need a User to know who is making a payment. Instead of modeling your services like this:

interface PaymentService {
    PaymentInfo makePayment(User user, Payment payment);
}

Model it like so:

interface PaymentService {
    PaymentInfo makePayment(Long userId, Payment payment);
}

This way, entities that belong to other microservices are referenced inside a particular service only by ID, not by object reference. This allows DB tables to have foreign keys all over the place, but at the app layer "foreign" entities (that is, entities living in other services) are available only via ID. This stops object cascading from growing out of control and cleanly delineates service boundaries.
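
As an entity-level sketch of that rule (the field names here are my own assumptions, not from the original design), a Payment records which user paid via a plain ID rather than a User object:

// Hypothetical Payment entity: the owning user is referenced by ID only,
// so no object graph crosses the PaymentService boundary.
class Payment {
    private final Long paymentId;
    private final Long userId;      // points into UserService's data, by ID
    private final long amountCents;

    Payment(Long paymentId, Long userId, long amountCents) {
        this.paymentId = paymentId;
        this.userId = userId;
        this.amountCents = amountCents;
    }

    Long getPaymentId() { return paymentId; }
    Long getUserId() { return userId; }   // used by the two-call lookup below
    long getAmountCents() { return amountCents; }
}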

The cost this does incur is more network calls. For instance, if I gave each Payment entity a User reference, I could get the user for a particular payment with a single call:

User user = paymentService.getUserForPayment(payment); 

But using what I'm suggesting here, you'll need two calls:

Long userId = paymentService.getPayment(payment).getUserId();
User user = userService.getUserById(userId);

This may be a deal breaker. But if you're smart, implement caching, and build well-engineered microservices that respond in 50-100 ms per call, I have no doubt that these extra network calls can be kept from adding noticeable latency to the application.
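
For example, here's a toy sketch of that caching idea; it's a bare in-memory map with no TTL or eviction (a real deployment would want something like Caffeine or Redis), and the User and UserService declarations are minimal placeholders so the sketch stands alone:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Placeholder declarations for the sketch.
record User(Long userId, String email) {}
interface UserService {
    User getUserById(Long userId);
}

// Toy read-through cache in front of remote user lookups.
class CachingUserService implements UserService {
    private final UserService delegate;  // the real remote client
    private final Map<Long, User> cache = new ConcurrentHashMap<>();

    CachingUserService(UserService delegate) {
        this.delegate = delegate;
    }

    @Override
    public User getUserById(Long userId) {
        // computeIfAbsent hits the network only on a cache miss, so repeated
        // payment-to-user lookups skip the second call entirely.
        return cache.computeIfAbsent(userId, delegate::getUserById);
    }
}

Callers still depend on UserService, so wrapping the remote client (new CachingUserService(remoteUserService)) changes no call sites.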

answered Sep 19 '22 by smeeb