
Scaling a rich domain model

Domain Driven Design encourages you to use a rich domain model. This means all the domain logic is located in the domain model, and that the domain model is supreme. Persistence becomes an external concern, as the domain model itself ideally knows nothing of persistence (e.g. the database).

I've been using this in practice on a medium-size one-man project (>100k lines of Java) and I'm discovering many advantages, mostly the flexibility and refactorability that this offers over a database-oriented approach. I can add and remove domain classes, hit a few buttons and an entire new database schema and SQL layer rolls out.

However, I often face issues where I'm finding it difficult to reconcile the rich domain logic with the fact that there's an SQL database backing the application. In general, this results in the typical "1+N queries problem", where you fetch N objects, and then execute a nontrivial method on each object that again triggers queries. Optimizing this by hand allows you to do the process in a constant number of SQL queries.
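To make the "1+N" shape concrete, here is a minimal sketch. It assumes a hypothetical `CountingStore` that stands in for the SQL layer and merely counts round trips; the users/orders data and all names are invented for illustration and are not taken from my actual project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a SQL-backed store; it only counts "queries".
class CountingStore {
    int queries = 0;
    private final Map<Integer, List<Integer>> orderTotalsByUser = new HashMap<>();

    CountingStore() {
        orderTotalsByUser.put(1, Arrays.asList(10, 20));
        orderTotalsByUser.put(2, Arrays.asList(5));
        orderTotalsByUser.put(3, Collections.<Integer>emptyList());
    }

    // One round trip, e.g. "SELECT id FROM users".
    List<Integer> findUserIds() {
        queries++;
        return new ArrayList<>(orderTotalsByUser.keySet());
    }

    // One round trip per call, e.g. "SELECT total FROM orders WHERE user_id = ?".
    List<Integer> findOrderTotals(int userId) {
        queries++;
        return orderTotalsByUser.get(userId);
    }

    // One round trip overall, e.g. a single aggregate query grouped by user.
    Map<Integer, Integer> sumOrderTotalsPerUser() {
        queries++;
        Map<Integer, Integer> sums = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : orderTotalsByUser.entrySet()) {
            int sum = 0;
            for (int total : e.getValue()) {
                sum += total;
            }
            sums.put(e.getKey(), sum);
        }
        return sums;
    }
}

class SpendingReport {
    // Naive domain-style traversal: 1 query for the users plus N for their orders.
    static int naiveTotal(CountingStore store) {
        int total = 0;
        for (int userId : store.findUserIds()) {
            for (int orderTotal : store.findOrderTotals(userId)) {
                total += orderTotal;
            }
        }
        return total;
    }

    // Hand-optimized version: a constant number of queries, independent of N.
    static int batchedTotal(CountingStore store) {
        int total = 0;
        for (int sum : store.sumOrderTotalsPerUser().values()) {
            total += sum;
        }
        return total;
    }
}
```

Both methods compute the same number; only the number of round trips differs, and the naive one grows with the number of users.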

My design lets me plug these optimized versions in. I do this by moving the code into a "query module" containing dozens of domain-specific queries (e.g. getActiveUsers), each of which has both an in-memory implementation (naive and not scalable) and an SQL-based implementation (for deployment use). This allows me to optimize the hotspots, but there are two main disadvantages:

  • I'm effectively moving some of my domain logic to places where it doesn't really belong, and in fact even pushing it into SQL statements.
  • The process requires me to peruse query logs to find out where the hotspots are, after which I have to refactor the code, reducing its level of abstraction by lowering it into queries.
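Roughly, the query module looks like the sketch below. Only `getActiveUsers` is a real name from my description; `UserQueries`, the two implementations, the table layout, and the injected `runQuery` function (a stand-in for whatever actually executes SQL) are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class User {
    final String name;
    final boolean active;

    User(String name, boolean active) {
        this.name = name;
        this.active = active;
    }

    // The domain rule lives here in the rich model.
    boolean isActive() {
        return active;
    }
}

// The "query module": one method per domain-specific query.
interface UserQueries {
    List<User> getActiveUsers();
}

// Naive implementation: needs every user in memory and applies the domain rule.
class InMemoryUserQueries implements UserQueries {
    private final List<User> allUsers;

    InMemoryUserQueries(List<User> allUsers) {
        this.allUsers = allUsers;
    }

    @Override
    public List<User> getActiveUsers() {
        return allUsers.stream().filter(User::isActive).collect(Collectors.toList());
    }
}

// Deployment implementation: the same rule, restated as SQL.
class SqlUserQueries implements UserQueries {
    // Assumed schema, for illustration only.
    static final String ACTIVE_USERS_SQL = "SELECT name FROM users WHERE active = TRUE";

    // Hypothetical hook that runs a statement and returns the name column.
    private final Function<String, List<String>> runQuery;

    SqlUserQueries(Function<String, List<String>> runQuery) {
        this.runQuery = runQuery;
    }

    @Override
    public List<User> getActiveUsers() {
        List<User> result = new ArrayList<>();
        for (String name : runQuery.apply(ACTIVE_USERS_SQL)) {
            result.add(new User(name, true));
        }
        return result;
    }
}
```

The first disadvantage is visible here: the definition of "active" exists twice, once in `User.isActive()` and once in the `WHERE` clause, and nothing forces the two to stay in sync.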

Is there a better, cleaner way to reconcile Domain-Driven-Design and its Rich Domain Model with the fact that you can't have all your entities in memory and are therefore confined to a database backend?

Wouter Lievens asked Dec 18 '08 16:12


2 Answers

There are at least two ways to look at this problem. One is the technical "what can I do to load my data smarter" version. The only really smart thing I know of there is dynamic collections that are partially loaded, with the rest loaded on demand and possibly some parts preloaded. There was an interesting talk at JavaZone 2008 about this.

The second approach has been more of my focus in the time I've been working with DDD: how can I make my model more "loadable" without sacrificing too much of the DDD goodness? My hypothesis over the years has been that a lot of DDD models capture domain concepts that are actually the sum of all allowable domain states, across all business processes and the different states that occur in each business process over time. It is my belief that many of these loading problems shrink considerably if the domain model is normalized slightly more in line with those processes and states. This usually means there is no "Order" object, because an order typically exists in multiple distinct states that have fairly different semantics attached (ShoppingCartOrder, ShippedOrder, InvoicedOrder, HistoricalOrder). If you try to encapsulate all of this in a single Order object, you invariably end up with a lot of loading/construction problems.
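As a rough illustration of what I mean, here is a sketch using three of the class names above. The fields, the transition methods (`ship`, `invoice`), and the validation rule are assumptions I made up for the example; a real model would carry whatever each state actually needs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Mutable while the customer is still shopping.
class ShoppingCartOrder {
    private final List<String> items = new ArrayList<>();

    void addItem(String sku) {
        items.add(sku);
    }

    // Transition to the next state; the cart itself never carries shipping data.
    ShippedOrder ship(String trackingCode) {
        if (items.isEmpty()) {
            throw new IllegalStateException("cannot ship an empty cart");
        }
        return new ShippedOrder(items, trackingCode);
    }
}

// Immutable snapshot: only what a shipped order needs.
class ShippedOrder {
    final List<String> items;
    final String trackingCode;

    ShippedOrder(List<String> items, String trackingCode) {
        this.items = Collections.unmodifiableList(new ArrayList<>(items));
        this.trackingCode = trackingCode;
    }

    InvoicedOrder invoice(long amountInCents) {
        return new InvoicedOrder(trackingCode, amountInCents);
    }
}

// Invoicing does not need the line items loaded at all in this sketch.
class InvoicedOrder {
    final String trackingCode;
    final long amountInCents;

    InvoicedOrder(String trackingCode, long amountInCents) {
        this.trackingCode = trackingCode;
        this.amountInCents = amountInCents;
    }
}
```

Each class can be loaded with a small, predictable set of columns, instead of one Order that has to be able to hydrate everything for every use case.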

But there's no silver bullet here.

krosenvold answered Oct 16 '22 05:10


In my experience this is the only way to do things. If you write a system that attempts to completely hide or abstract the persistence layer then there is no way that you can optimise things using the specifics of the persistence layer.

I have been running up against this issue recently and have been working on a solution where persistence layers can choose to implement interfaces that represent optimisations. I have only been playing with it so far, but to use your getActiveUsers example, it goes like this...

First write a ListAllUsers method that does everything at the domain level. For a while this will work, then it will begin to get too slow.

When using the rich domain model gets slow, create an interface called "IListActiveUsers" (or probably something better), and have your persistence code implement this interface using whatever techniques are appropriate (probably optimised SQL).

Now you can write a layer that checks these interfaces and calls the specific method if it exists.
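A minimal Java sketch of that layer follows. Only `IListActiveUsers` is a name from this answer; `UserAccount`, `UserStore`, `UserDirectory`, and the two store classes are hypothetical, and the "optimised" store just filters a list where real code would presumably issue a `WHERE` query.

```java
import java.util.List;
import java.util.stream.Collectors;

class UserAccount {
    final String name;
    final boolean active;

    UserAccount(String name, boolean active) {
        this.name = name;
        this.active = active;
    }

    boolean isActive() {
        return active;
    }
}

// What every persistence layer must offer, however naive.
interface UserStore {
    List<UserAccount> loadAllUsers();
}

// Optional capability that a persistence layer may choose to implement.
interface IListActiveUsers {
    List<UserAccount> listActiveUsers();
}

// The layer that checks for the optimisation and falls back to the domain logic.
class UserDirectory {
    private final UserStore store;

    UserDirectory(UserStore store) {
        this.store = store;
    }

    List<UserAccount> activeUsers() {
        if (store instanceof IListActiveUsers) {
            return ((IListActiveUsers) store).listActiveUsers();
        }
        // Fallback: load everything and apply the rule in the domain model.
        return store.loadAllUsers().stream()
                .filter(UserAccount::isActive)
                .collect(Collectors.toList());
    }
}

class NaiveStore implements UserStore {
    private final List<UserAccount> rows;

    NaiveStore(List<UserAccount> rows) {
        this.rows = rows;
    }

    @Override
    public List<UserAccount> loadAllUsers() {
        return rows;
    }
}

class OptimisedStore implements UserStore, IListActiveUsers {
    private final List<UserAccount> rows;
    int optimisedCalls = 0; // lets a caller see that the fast path was taken

    OptimisedStore(List<UserAccount> rows) {
        this.rows = rows;
    }

    @Override
    public List<UserAccount> loadAllUsers() {
        return rows;
    }

    @Override
    public List<UserAccount> listActiveUsers() {
        optimisedCalls++;
        // Stand-in for an optimised query; here it simply filters in memory.
        return rows.stream().filter(UserAccount::isActive).collect(Collectors.toList());
    }
}
```

`UserDirectory` gives the same answer with either store, so the domain code keeps working against a totally naive persistence layer and the optimisation is purely additive.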

This isn't perfect, and I don't have a lot of experience with this sort of thing. But it seems to me that the key is to ensure that all code still works even with a totally naive persistence method. Any optimisation needs to be done as an addition to this.

Jack Ryan answered Oct 16 '22 04:10