Moving to Domain Driven Design


A while ago I started learning about Domain Driven Design (DDD). I was a CRUD guy, with Database Driven Design in mind.

In my meetings with the domain specialists at work, whenever they talked about a new feature, I had already finished the implementation in my mind by the time they stopped talking. I was very fast: I knew my programming languages (mostly PHP and JavaScript) and the business domain very well, so this was simple.
BUT there was something wrong, and I realized it only after I read Eric Evans’s book on Domain Driven Design. During those meetings, I was often stepping in and commenting on implementation details: “Aha, this means we need to add another object.” or “Yes, this could be done with yet another many-to-many relation from this object type to that object type”. I had object descriptors, and the SQL database was updated semi-automatically to reflect those descriptors. After some years, they started to think like me: they started to understand this software architecture and could even estimate how long a feature would take. I was so good at programming that I never said “NO”. So far so good.

BUT when the system developed complex behavior, things started to get difficult. More and more often I was saying “This is difficult, we can’t mix object types in this query, don’t you know that already?” or “Hmm, I can do that, but the code will be very difficult for new or junior developers to maintain. Do you really want it?”. Another kind of NO was “Our MySQL will not like this query. I can’t think of an index to speed things up; there are already 10 indexes, and I can’t add this new long index because it will slow down the editing of products”. I tried to scale the MySQL database horizontally, but the commercial solutions were too expensive and I was not convinced that they would speed things up enough. At best we would delay the inevitable.

So, I needed something different. I didn’t understand where the problem was. I really knew everything about PHP and MySQL. I even thought of switching to NoSQL, but MongoDB didn’t have joins. It had collections with nested documents, but that was not helping me. My data was normalized, just as the best practices said. I was doing everything by the book and still failing. My code base started to look like a monster. I had an endless bunch of CRON scripts that were trying to denormalize the data, duplicating user nicknames and company names so I could spare an SQL JOIN.

I even made a framework that contained 50% of the code: a bootstrap framework that would help me develop a new website very rapidly, complete with DAOs, SQL migration tools, SQL query builders and many other shared PHP classes. The other 50% was specific object descriptors, some controllers and UI details. In the administration panel I was using the Smart UI (Anti-)Pattern. That seemed very cool: the UI would adapt itself to the object descriptors without any additional effort from the developers.

So, things were getting harder and harder, but salvation came: Domain Driven Design. I must say, at first I could not accept that book. It was invalidating all I knew about software development; that was painful, very painful. Something broke in me.

Fortunately that moment passed. I started to see what was wrong with my approach: Database Driven Design was limiting my thinking. Then I realized that the Database was not the most important thing, the center of the Universe; the Domain was. The Domain had to be reflected in my code, not the Database. So I started to accept the things Eric Evans was saying. But it was not easy. No, no.

The first thing I remember not accepting easily was the Aggregate. I was thinking: “What do you mean by loading and persisting ALL the entities in that Aggregate together? That will be very, very slow!”. What I know now is that it is indeed slower, but not very, very slow, and the benefits outweigh that loss of speed. The idea is that computers get faster and faster but people do not. A Big Ball of Mud has a much worse effect on the business than the loss of speed on the write side of the application (the Aggregate persisting side). This was my problem: my code had become a monstrous Big Ball of Mud. Aggregates did not help me clear things up, though; what they helped with was ensuring that invariants hold. Before Aggregates, my models (PHP classes) had only getters and setters. A junior programmer could easily put an Entity in an invalid state and persist it. They only had to call a single setter and then $object->save();.
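To make the contrast with setters concrete, here is a minimal sketch of an Aggregate that guards its own invariants, assuming PHP 7.4 or newer. The Order class, its methods and its two rules are hypothetical, chosen only to illustrate the idea; persistence is left out on purpose.

```php
<?php
declare(strict_types=1);

// Hypothetical aggregate root. There are no public setters: state changes
// only through methods that check the business rules first.
final class Order
{
    /** @var array<string, int> quantity per SKU */
    private array $lines = [];
    private bool $placed = false;

    public function addLine(string $sku, int $quantity): void
    {
        if ($this->placed) {
            throw new DomainException('A placed order cannot be changed.');
        }
        if ($quantity <= 0) {
            throw new InvalidArgumentException('Quantity must be positive.');
        }
        $this->lines[$sku] = ($this->lines[$sku] ?? 0) + $quantity;
    }

    public function place(): void
    {
        if ($this->lines === []) {
            throw new DomainException('An empty order cannot be placed.');
        }
        $this->placed = true;
    }

    public function isPlaced(): bool
    {
        return $this->placed;
    }

    public function totalQuantity(): int
    {
        return array_sum($this->lines);
    }
}

$order = new Order();
try {
    $order->place(); // rejected: the order has no lines yet
} catch (DomainException $e) {
    echo $e->getMessage(), PHP_EOL; // prints "An empty order cannot be placed."
}

$order->addLine('SKU-1', 2);
$order->place(); // accepted now
```

With this shape, whatever gets saved has already passed the rules, because there is no way to reach an invalid state from the outside.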

I remember that I had problems understanding the importance of the Ubiquitous Language and of Bounded Contexts. I was thinking “Why should I split that class, why not keep these two behaviors in the same Product class? It is the same Product, with the same ID in the Inventory and in the Catalog, right?” Wrong! Although it refers to the same physical thing in real life, a Product in the Catalog is modeled differently from a Product in the Inventory. They are both Products, but they have different properties and different behaviors attached to those properties. You don’t need to rename them CatalogProduct and InventoryProduct. You should name them both Product but put them in different namespaces, one namespace for every Bounded Context. That is a Bounded Context in PHP terms: a namespace. If you have two classes that have the same short name but live in different namespaces, then you have two different models. Ah, and one other thing: don’t use inheritance for Products, even if they have some common fields. Keep the Products clean, with no dependencies. If you create two classes instead of one, things will start to clear up.
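As a rough illustration of “same short name, different namespace”, the sketch below puts two unrelated Product classes in a single file using PHP’s bracketed namespace syntax (in a real code base each class would normally live in its own file). It assumes PHP 7.4 or newer; the properties and methods are hypothetical.

```php
<?php
declare(strict_types=1);

namespace Catalog {
    // The Catalog cares about what the customer reads and pays.
    final class Product
    {
        private string $id;
        private string $title;
        private int $priceCents;

        public function __construct(string $id, string $title, int $priceCents)
        {
            $this->id = $id;
            $this->title = $title;
            $this->priceCents = $priceCents;
        }

        public function id(): string { return $this->id; }
        public function title(): string { return $this->title; }
        public function priceCents(): int { return $this->priceCents; }
    }
}

namespace Inventory {
    // The Inventory cares about how many units are on the shelf.
    final class Product
    {
        private string $id;
        private int $inStock;

        public function __construct(string $id, int $inStock)
        {
            $this->id = $id;
            $this->inStock = $inStock;
        }

        public function ship(int $quantity): void
        {
            if ($quantity <= 0 || $quantity > $this->inStock) {
                throw new \DomainException('Cannot ship that quantity.');
            }
            $this->inStock -= $quantity;
        }

        public function id(): string { return $this->id; }
        public function inStock(): int { return $this->inStock; }
    }
}

namespace {
    // Same identity, two models, no shared base class.
    $id = 'product-42';
    $listed = new \Catalog\Product($id, 'Blue mug', 1250);
    $stocked = new \Inventory\Product($id, 10);

    $stocked->ship(3);
    echo $listed->title(), ': ', $stocked->inStock(), ' left', PHP_EOL; // prints "Blue mug: 7 left"
}
```

The only thing the two classes share is the ID value; neither one knows that the other exists.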

For the architecture, at that time, after reading the book, I chose a Layered architecture. But although DDD helped me capture the Domain in my code, my code still had something wrong. My models had two sides, two faces, two interfaces. Some code was about enforcing the invariants on the state, other code was used for queries, and those two kinds of code were very different. The only thing keeping the two interfaces together was the data, the state. But that data was used differently by the commands (the methods that modified the data) and by the queries. The commands wanted raw data, while the queries were transforming that raw data into formatted strings, the things that the user sees on the screen.

Then the second salvation came: CQRS. I must say, CQRS really, really cleaned up my code. Greg Young told me to make two classes instead of one. So I did, and the result was formidable: the code finally became clean to a satisfactory level. After working with CQRS for some time I saw the benefits. I should give you a piece of advice: CQRS will not reveal itself to you easily. You must not give up; just keep trying until you see it. You will be like Neo in The Matrix: you will see things differently. You will not have to dodge bullets, you will stop them with your hand. The bullets are the domain complexity, and stopping them is the splitting of concerns into two classes. A very, very interesting thing about CQRS is the Query part. The Read Models, the classes used to display the data to the users, are very easily optimized for every use case; you can create a table for every query that you need, without the need for JOINs! Really, no JOINs! Do you feel what that means? It means that perfect optimizations are possible. It means that you can create The Perfect Cache: really fast, easy to invalidate and built ahead of time. And if you can’t easily forget your old friend, you can keep it here: MySQL works fine for read-side persistence.
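A very small sketch of “two classes instead of one” could look like the following, assuming PHP 7.4 or newer. All names are hypothetical, a plain array stands in for the denormalized read table (a MySQL table would do just as well), and the EUR currency label is an assumption made only for the example.

```php
<?php
declare(strict_types=1);

// Write side: raw data plus the rules that protect it.
final class Product
{
    private string $id;
    private string $name;
    private int $priceCents;

    public function __construct(string $id, string $name, int $priceCents)
    {
        $this->id = $id;
        $this->name = $name;
        $this->changePrice($priceCents);
    }

    public function changePrice(int $priceCents): void
    {
        if ($priceCents < 0) {
            throw new InvalidArgumentException('Price cannot be negative.');
        }
        $this->priceCents = $priceCents;
    }

    public function id(): string { return $this->id; }
    public function name(): string { return $this->name; }
    public function priceCents(): int { return $this->priceCents; }
}

// Read side: one pre-formatted row per product, shaped for one screen.
// Queries read the rows as they are: no JOINs and no formatting at read time.
final class ProductListReadModel
{
    /** @var array<string, array{name: string, price: string}> */
    private array $rows = [];

    // Called after every successful command to refresh the row.
    public function project(Product $product): void
    {
        $this->rows[$product->id()] = [
            'name' => $product->name(),
            'price' => number_format($product->priceCents() / 100, 2, '.', '') . ' EUR',
        ];
    }

    public function all(): array
    {
        return array_values($this->rows);
    }
}

$product = new Product('p-1', 'Blue mug', 1250);
$readModel = new ProductListReadModel();
$readModel->project($product);

$product->changePrice(990);    // command on the write side
$readModel->project($product); // the read side catches up

foreach ($readModel->all() as $row) {
    echo $row['name'], ' | ', $row['price'], PHP_EOL; // prints "Blue mug | 9.90 EUR"
}
```

The write model never formats anything and the read model never validates anything, which is the split the two faces of the old model were asking for.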

You may wonder what comes next. How could things get even better? Well, for passionate programmers, things always get better. After CQRS I discovered Event Sourcing. After understanding it (or so I think 🙂 ), the database really became unimportant; I moved it to the outer edges of my code. Event Sourcing says that the state of the application is persisted not as a snapshot but as the series of events that is used to rebuild it every time a new modification is necessary. So you don’t persist an Aggregate in a table or collection but in an Event Store. Event Sourcing led me to discover Event-driven architectures, a very nice kind of architecture and my favorite.
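The load, modify, append cycle can be sketched roughly like this, assuming PHP 7.4 or newer. The StockItem aggregate and its event names are hypothetical, events are plain arrays to keep the example short, and an in-memory array stands in for a real Event Store.

```php
<?php
declare(strict_types=1);

// Hypothetical event-sourced aggregate: its state is never saved directly,
// it is rebuilt from the recorded events every time it is loaded.
final class StockItem
{
    private int $inStock = 0;
    /** @var array<int, array{type: string, quantity: int}> events not yet stored */
    private array $newEvents = [];

    public static function fromHistory(array $events): self
    {
        $item = new self();
        foreach ($events as $event) {
            $item->apply($event);
        }
        return $item;
    }

    public function receive(int $quantity): void
    {
        if ($quantity <= 0) {
            throw new InvalidArgumentException('Quantity must be positive.');
        }
        $this->record(['type' => 'stock_received', 'quantity' => $quantity]);
    }

    public function ship(int $quantity): void
    {
        if ($quantity <= 0 || $quantity > $this->inStock) {
            throw new DomainException('Cannot ship that quantity.');
        }
        $this->record(['type' => 'stock_shipped', 'quantity' => $quantity]);
    }

    // Hands over the new events so the caller can append them to the store.
    public function releaseEvents(): array
    {
        $events = $this->newEvents;
        $this->newEvents = [];
        return $events;
    }

    public function inStock(): int
    {
        return $this->inStock;
    }

    private function record(array $event): void
    {
        $this->apply($event);
        $this->newEvents[] = $event;
    }

    private function apply(array $event): void
    {
        switch ($event['type']) {
            case 'stock_received':
                $this->inStock += $event['quantity'];
                break;
            case 'stock_shipped':
                $this->inStock -= $event['quantity'];
                break;
        }
    }
}

$eventStore = []; // stand-in for a real Event Store

$item = StockItem::fromHistory($eventStore);
$item->receive(10);
$eventStore = array_merge($eventStore, $item->releaseEvents());

$item = StockItem::fromHistory($eventStore); // rebuilt from events, not from a snapshot
$item->ship(3);
$eventStore = array_merge($eventStore, $item->releaseEvents());

echo StockItem::fromHistory($eventStore)->inStock(), PHP_EOL; // prints 7
```

Nothing is ever updated in place: every modification only appends to the list, and the current state is just the result of replaying it.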

So, what comes next? Well, what do you do when your monolithic application has 1 million lines of code? You split it again, this time into microservices. Although there are downsides to this kind of splitting, microservices are surely worth studying. I really like them. In my current project, at the time of writing, I’ve split part of my monolith into a separate microservice: a file storage. I haven’t built it the way I wanted, using event publishing, because I don’t have the necessary time, but I hope I will. Another microservice that I plan to extract is the SearchMicroservice: it will listen to all the relevant events and build a search index.
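One possible shape for such a listener, purely as an illustration, is a class that consumes events and keeps a word-to-product index up to date. The sketch assumes PHP 7.4 or newer; the event names, the array-based event format and the in-memory index are all assumptions, and a real service would receive events over a message bus and probably hand indexing to a dedicated search engine.

```php
<?php
declare(strict_types=1);

// Hypothetical projector: it reacts to the events it cares about and
// maintains an inverted index of the form word => set of product IDs.
final class SearchIndex
{
    /** @var array<string, array<string, true>> */
    private array $index = [];

    public function on(array $event): void
    {
        if ($event['type'] !== 'product_renamed') {
            return; // events this service does not care about are ignored
        }

        $id = $event['productId'];

        // Remove the words indexed for the previous name of this product.
        foreach (array_keys($this->index) as $word) {
            unset($this->index[$word][$id]);
            if ($this->index[$word] === []) {
                unset($this->index[$word]);
            }
        }

        // Index every word of the new name, lower-cased.
        $words = preg_split('/\W+/', strtolower($event['name']), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            $this->index[$word][$id] = true;
        }
    }

    /** @return array<int, string> IDs of the products whose name contains the word */
    public function search(string $word): array
    {
        return array_keys($this->index[strtolower($word)] ?? []);
    }
}

$index = new SearchIndex();
$index->on(['type' => 'product_renamed', 'productId' => 'p-1', 'name' => 'Blue Mug']);
$index->on(['type' => 'product_renamed', 'productId' => 'p-2', 'name' => 'Blue Plate']);
$index->on(['type' => 'stock_shipped', 'productId' => 'p-1', 'quantity' => 1]); // ignored
$index->on(['type' => 'product_renamed', 'productId' => 'p-1', 'name' => 'Red Mug']);

echo implode(',', $index->search('blue')), PHP_EOL; // prints "p-2"
echo implode(',', $index->search('mug')), PHP_EOL;  // prints "p-1"
```

Because the index is built only from events, it can be thrown away and rebuilt from scratch by replaying them.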

Finally, I should add that you can manage the complexity even further by using SOLID. Those five principles should be applied everywhere at the lower levels. SOLID should be as valuable to you as the air that you are breathing right now.