Chunking up our code

No matter how smart we are, when things become too complex we can no longer hold them in our heads completely. I recently saw this tweet from Kent Beck that makes a lot of sense.

The goal of software design is to create chunks or slices that fit into a human mind. The software keeps growing but the human mind maxes out, so we have to keep chunking and slicing differently if we want to keep making changes.

This limit affects development. Once the code no longer fits in our head, it takes more time to change something and we introduce more bugs.

Ok, cool, makes sense. Let’s chunk our code up. But how? How big should we make a chunk? Or how small? If a chunk grows too big, it will become too complex again. But if we make the chunk too small, we will just move the complexity to some other place (looking at you, max-100-lines-of-code-microservices!). Domain Driven Design to the rescue!

DDD and bounded context

Domain Driven Design, DDD in short, is an approach to developing software that helps us manage the complexity in our software. When using DDD, we will try to create a technical model that is as close as possible to the mental model of the user. Language will be one of our most important guidelines for doing this.

DDD describes a set of patterns which will help us in modeling the software. One of these patterns is called the Bounded Context and it will help us chunk up the software. Of course, language plays an important role in defining what a bounded context is.

In language, the same word can have different meanings. Take for example the following sentence: “There’s a crane sitting on that crane.” For anyone who understands English, it’s obvious that the first occurrence of crane means the bird and the second means the construction machine. The reason we can make this distinction in such a short sentence is context. A construction machine cannot sit, so the first crane must be the bird. The word sit gives us enough context to know what is meant.

Image credits: Kristoferb, Tower crane atop Mont Blanc, France (CC BY-SA 3.0); Dawn Huczek, Double Trouble (CC BY 2.0).

When working in a business, the same happens, although the difference in meaning is often a lot more subtle. In the conference organizing business, the word venue can mean many different things, even though the real-world concept is always the same building (or buildings).

However, how someone thinks about a venue will differ based on their point of view. A conference organizer will talk about things like total capacity, parking spots, proximity to public transport, number of rooms, and so on. Someone attending a conference cares about the address of the venue, the layout plan, and the location of the toilets.

There will be some common words, like rooms, but while the organizer wants to know about the capacity of a room, the attendee is mostly interested in who is speaking there. We could say we have identified two different bounded contexts (BC for short) here: one for the attendee and one for the organizer.

Ubiquitous language

When eavesdropping on 2 people talking, we will be able to decide whether they organize a conference or attend a conference based on the words they use in the conversation. This means that the language they use is bound to that context. DDD calls this language the Ubiquitous Language.

A bounded context is characterized by what is called a ubiquitous language where every term has a precise and unambiguous definition. This language is consistent with the business language. This tends to set a limit on the size of a bounded context since the more you enlarge it, the more ambiguity creeps in. — Ben Morris, Chief Architect @ Mintel

When writing software for conferences, we can use this insight to chunk up our software and make the complexity inherent to the business more manageable.

In practice

How can we apply this knowledge when developing software? What exactly does it mean to chunk up our software in bounded contexts?

The basic idea is that we have to completely separate the code of different bounded contexts from each other. We should treat them almost as separate systems. Code that lives in one BC should not directly interact with code in another BC.

The safest way to ensure this would be to actually treat each BC as a separate system and move each one to its own git repository. However, this comes with some tradeoffs which we don’t always want. When our entire system is still small, this may cause our infrastructure to become needlessly complicated.

When our development team is smaller, it’s often easier to have everything in 1 repository instead of many different repositories. Not every team is ready to start maintaining many different systems next to each other.

Luckily there is a middle ground. We can keep our BCs in the same repository and still keep all their advantages. We will just have to be a little bit more careful about keeping everything separated properly.

We can let ourselves be guided by some heuristics:

  • How easy is it to let different bounded contexts evolve at a separate pace?
  • How autonomous is our BC (or how easy/cheap is it to move our BC to a separate repository)?

They aren’t necessarily the end goals but serve as good guidelines when we’re not sure how we should handle something.

An obvious and easy way to start out is by organizing our code according to our bounded contexts. In PHP we can do this by letting each bounded context live in its own namespace. This separation should happen as close as possible to the root of our application.

In Laravel, this would mean dismissing the default Laravel directory structure and letting each bounded context live in the app directory as in the example below. Planning, Sponsoring, and TicketSales are different BCs.

├── app
│ ├── Planning
│ ├── Sponsoring
│ └── TicketSales
├── bootstrap
├── config
├── database
├── public
├── resources
├── routes
├── storage
├── tests
└── vendor

Dismissing the default Laravel directory structure doesn’t mean we shouldn’t use it at all, but we can make that decision for each bounded context separately — in some of them it may make sense; in others, it doesn’t.

It is worth mentioning that different bounded contexts can contain classes with the same name. Both Planning and TicketSales can have a Tutorial class. They may even have some common properties. This is perfectly fine. The two classes have different purposes and should evolve separately.
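As a minimal sketch of what this could look like, here are two Tutorial classes in one file. The properties and methods are assumptions made up for illustration, and the bracketed namespace syntax is only used to keep the example in a single file. In a real application each class would live in its own file under its BC’s directory.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: all properties and methods below are illustrative assumptions.

namespace App\Planning {
    // In Planning, a tutorial is something to fit into the schedule.
    final class Tutorial
    {
        public function __construct(
            public string $title,
            public string $speaker,
            public int $durationInMinutes
        ) {
        }

        public function fitsInSlot(int $slotInMinutes): bool
        {
            return $this->durationInMinutes <= $slotInMinutes;
        }
    }
}

namespace App\TicketSales {
    // In TicketSales, a tutorial is something to sell seats for.
    final class Tutorial
    {
        public function __construct(
            public string $title,
            public int $priceInCents,
            public int $seatsLeft
        ) {
        }

        public function isSoldOut(): bool
        {
            return $this->seatsLeft === 0;
        }
    }
}

namespace {
    $planned = new \App\Planning\Tutorial('DDD in practice', 'Jane Doe', 180);
    $forSale = new \App\TicketSales\Tutorial('DDD in practice', 15000, 0);

    var_dump($planned->fitsInSlot(240)); // bool(true)
    var_dump($forSale->isSoldOut());     // bool(true)
}
```

Both classes share a title, but adding a price to the Planning version or a speaker to the TicketSales version is never forced on us.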

Classes such as controllers (or CLI Commands, for example) should also live inside their bounded context. A BC is about separating behavior, and controllers are the entry points for this behavior. A controller should not rely on the behavior of multiple BCs. Remember that we are trying to create a technical model that is as close as possible to the mental model of the user of our application.

We already established that there are multiple mental models living next to each other (Bounded Contexts). Controllers are the behavior we expose to the user, so when we expose behavior that spans multiple BCs, we are violating the mental model the user has of our system.
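A rough sketch of a controller that stays inside its BC could look like this. It is framework-free on purpose. The Schedule and CancelTalkController classes and the status codes are assumptions for illustration, not Laravel classes.

```php
<?php

declare(strict_types=1);

namespace App\Planning {
    // Hypothetical in-memory schedule; a real BC would persist this.
    final class Schedule
    {
        /** @var array<string, string> talk title => speaker name */
        private array $talks = [];

        public function addTalk(string $title, string $speaker): void
        {
            $this->talks[$title] = $speaker;
        }

        public function cancelTalk(string $title): bool
        {
            if (!isset($this->talks[$title])) {
                return false;
            }
            unset($this->talks[$title]);

            return true;
        }
    }

    // Entry point for Planning behavior. It depends on Planning classes only.
    final class CancelTalkController
    {
        public function __construct(private Schedule $schedule)
        {
        }

        /** Returns an HTTP-like status code. */
        public function __invoke(string $title): int
        {
            return $this->schedule->cancelTalk($title) ? 204 : 404;
        }
    }
}

namespace {
    $schedule = new \App\Planning\Schedule();
    $schedule->addTalk('DDD in practice', 'Jane Doe');
    $controller = new \App\Planning\CancelTalkController($schedule);

    var_dump($controller('DDD in practice')); // int(204)
    var_dump($controller('DDD in practice')); // int(404), the talk is already gone
}
```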

This does not mean that changes in one BC can’t have effects in another BC, but we’ll get to that later.

We should be wary of one BC directly calling code in another BC. Doing so will break the boundaries around our context. If we change the called code, the calling code will also need to change, or will at least behave differently. That makes it harder to let both BCs evolve at a different pace, which is something we want to avoid. We will talk about how to handle communication between BCs later.
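One way to keep an eye on this is a small check that flags imports crossing a boundary. The function below is a naive sketch, not a full static analysis tool. It assumes every BC lives directly under the App\ root namespace and it only looks at simple use statements. The function name findCrossContextImports is made up for this example.

```php
<?php

declare(strict_types=1);

/**
 * Returns the imports in $source that point into another bounded context.
 * Assumption: every BC lives directly under the App\ root namespace.
 *
 * @return list<string> fully qualified names imported from other BCs
 */
function findCrossContextImports(string $source, string $ownContext): array
{
    // Matches simple "use App\Context\Rest" lines; grouped use statements are not handled.
    preg_match_all('/^\s*use\s+App\\\\(\w+)\\\\([\w\\\\]+)/m', $source, $matches, PREG_SET_ORDER);

    $violations = [];
    foreach ($matches as [, $context, $rest]) {
        if ($context !== $ownContext) {
            $violations[] = 'App\\' . $context . '\\' . $rest;
        }
    }

    return $violations;
}

// Hypothetical file contents from the Planning BC.
$planningFile = <<<'PHP'
namespace App\Planning;

use App\Planning\Schedule;
use App\TicketSales\Refund;
PHP;

print_r(findCrossContextImports($planningFile, 'Planning'));
// Lists one violation: App\TicketSales\Refund
```

Running something like this in CI over every file of a BC gives an early warning when a boundary starts to leak.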

Maybe less obvious, but at least as important, is that each BC should use its own database (or other persistence mechanism) schema. A database is another part of our system where it’s easy to create the coupling we want to avoid. The structure of our database is often influenced by the structure of our code and vice versa. When two BCs share a table and a code change in one of them changes the table structure, the other BC will have to change as well.

An easy way to ensure this separation of database structure is to just have each BC use its own database. That way, there can be no confusion about which BC owns a certain table. This means there may be some data duplication between the two databases. This is a price we must pay to reduce the complexity of the system as a whole. When working with duplicate data, it’s important you have a clear source of truth for this data. One BC should be responsible for managing this data; the other BC should stick to reading only.
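In Laravel, this could be sketched as one named connection per BC. The array below follows the shape of the connections entry in config/database.php. The connection names, hosts, database names, and credentials are placeholders, not values from a real project.

```php
<?php

declare(strict_types=1);

// Shaped like the 'connections' array in Laravel's config/database.php.
// Every name and credential here is a placeholder assumption.
$connections = [
    // Owned by the Planning BC: the source of truth for talks and speakers.
    'planning' => [
        'driver' => 'mysql',
        'host' => '127.0.0.1',
        'database' => 'conference_planning',
        'username' => 'planning_app',
        'password' => '',
    ],
    // Owned by the TicketSales BC: may hold a read-only copy of talk data.
    'ticket_sales' => [
        'driver' => 'mysql',
        'host' => '127.0.0.1',
        'database' => 'conference_ticket_sales',
        'username' => 'ticket_sales_app',
        'password' => '',
    ],
];

// An Eloquent model can then opt into its BC's connection with:
//     protected $connection = 'planning';
```

Giving each BC its own database user as well makes it harder to reach into another BC’s tables by accident.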

Obviously, what happens in one BC should often have an effect in other BCs. If a speaker cancels in the Planning BC, something could change in the TicketSales BC. Maybe we need to trigger some refunds or stop the sale of certain tickets. If this wasn’t possible, our software would quickly become very limited.

There are a couple of ways different systems can communicate with each other. Planning could perform an API call to TicketSales and tell the system the speaker cancelled their talk. However, this does not scale well. Planning would have to know about every other BC that cares about speaker cancellations, which APIs they expose, and how to call each of them. Imagine this happening between many different BCs, for many different use cases. Our nicely chunked up code quickly becomes an unmaintainable mess of API calls between different systems.

We could also reverse the direction of the API call and have TicketSales ask Planning about this speaker. This adds a few problems as well. When should we ask this? When a possible attendee wants to buy a ticket? Then what if the Planning system is down? How do we react to that? And if we need to make multiple API calls like this, the system will become slow.

A more elegant method is using events. When a speaker cancels their talk, Planning emits an event, SpeakerCancelledTalk. Any other BC that needs this event can subscribe to it and act appropriately when it arrives. If we persist these events somewhere centrally, in Kafka for example, a BC that was down for a while can catch up later, so it does not have to miss an event.
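To make the idea concrete, here is a minimal in-process sketch. The EventBus class stands in for a real broker such as Kafka, and it, the App\Shared namespace, and the handler names are all assumptions for illustration. Events travel as a name plus a plain array, so TicketSales never has to import a class from Planning.

```php
<?php

declare(strict_types=1);

namespace App\Shared {
    // Hypothetical in-process bus; a real setup would use a message broker.
    final class EventBus
    {
        /** @var array<string, list<callable>> event name => handlers */
        private array $subscribers = [];

        public function subscribe(string $eventName, callable $handler): void
        {
            $this->subscribers[$eventName][] = $handler;
        }

        /** @param array<string, mixed> $payload */
        public function publish(string $eventName, array $payload): void
        {
            foreach ($this->subscribers[$eventName] ?? [] as $handler) {
                $handler($payload);
            }
        }
    }
}

namespace App\Planning {
    use App\Shared\EventBus;

    final class CancelTalk
    {
        public function __construct(private EventBus $bus)
        {
        }

        public function __invoke(string $talkId): void
        {
            // A real BC would update the schedule here before announcing it.
            $this->bus->publish('SpeakerCancelledTalk', ['talkId' => $talkId]);
        }
    }
}

namespace App\TicketSales {
    final class StopSalesForCancelledTalk
    {
        /** @var list<string> */
        public array $closedTalkIds = [];

        /** @param array<string, mixed> $payload */
        public function __invoke(array $payload): void
        {
            $this->closedTalkIds[] = (string) $payload['talkId'];
        }
    }
}

namespace {
    $bus = new \App\Shared\EventBus();
    $stopSales = new \App\TicketSales\StopSalesForCancelledTalk();
    $bus->subscribe('SpeakerCancelledTalk', $stopSales);

    $cancelTalk = new \App\Planning\CancelTalk($bus);
    $cancelTalk('talk-42');

    print_r($stopSales->closedTalkIds); // contains 'talk-42'
}
```

Planning only knows the bus. Adding a Sponsoring handler later means one more subscribe call and no change to Planning.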

When using events to communicate between BCs, we need to make sure we treat them as an API used by external systems. Other BCs depend on the name and structure of our events. We must not make backwards-incompatible changes to them before the other BCs are ready for that.
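One common way to leave room for change is a tolerant reader on the consuming side: require only what we really need, give newer fields a default, and ignore everything else. The sketch below assumes an array payload with a required talkId. The reason and notifiedAt fields are hypothetical later additions.

```php
<?php

declare(strict_types=1);

/**
 * Reads a SpeakerCancelledTalk payload in a tolerant way.
 * Assumptions: 'talkId' is required; 'reason' is a field added in a later version.
 *
 * @param array<string, mixed> $payload
 * @return array{talkId: string, reason: string}
 */
function readSpeakerCancelledTalk(array $payload): array
{
    if (!isset($payload['talkId']) || !is_string($payload['talkId'])) {
        throw new InvalidArgumentException('SpeakerCancelledTalk requires a string talkId.');
    }

    // Older producers do not send a reason, so fall back to a default.
    $hasReason = isset($payload['reason']) && is_string($payload['reason']);

    // Unknown extra keys are simply ignored, so producers can add fields safely.
    return [
        'talkId' => $payload['talkId'],
        'reason' => $hasReason ? $payload['reason'] : 'unknown',
    ];
}

$fromOldProducer = readSpeakerCancelledTalk(['talkId' => 'talk-42']);
$fromNewProducer = readSpeakerCancelledTalk([
    'talkId' => 'talk-42',
    'reason' => 'illness',
    'notifiedAt' => '2024-01-01',
]);

var_dump($fromOldProducer['reason']); // string(7) "unknown"
var_dump($fromNewProducer['reason']); // string(7) "illness"
```

Adding a field is then a safe change. Renaming or removing talkId is not, and would need a new event name or a migration period.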

When we apply the rules above, it will be easy to scale our system and still keep it maintainable. Our BCs may grow and split up into more BCs, or they may converge and merge in the end. However, when we keep strictly to the rules above, we will not end up with a system that no longer fits in our head or can no longer be reasoned about. If we ever decide that we still need different repositories, it will be easy enough to extract our BCs to their own separate repositories in which they can continue their journey.