Other People’s Databases

My team has an elbow-length blue latex glove on the wall of our team area, with the words “other people’s databases” written on it.

This is our tongue-in-cheek reminder to always think twice before sticking your fingers (or more) into a database that doesn’t belong to your application.

I realized the other day that this is not a universal principle, and this came as a bit of a surprise to me. Maybe I’ve just lived with it as an axiom for so long, I assumed everyone else did as well.

The Rule

What I mean by “other people’s databases” is the rule that I follow, and my team has adopted, that says each deployable service or application in our overall system must communicate with other services and applications via established APIs. In some cases, these are asynchronous events (messages) passed between applications via an event bus. In other cases they are REST calls that one application makes to another.

In all cases, the API is one piece of application code talking to another via some established protocol.
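As a rough illustration of the event style, here is a minimal sketch in Python. Everything in it is hypothetical: the in-process EventBus stands in for a real message broker, and the CustomerCreated event, its fields and the BillingService are invented for the example. The point is only that the billing code learns about customers from a published event, not by reading anyone’s tables.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Toy in-process stand-in for a real message broker (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Each subscriber gets its own copy, so it cannot mutate the
        # publisher's data behind its back.
        for handler in self._handlers[event_type]:
            handler(dict(payload))


class BillingService:
    """Learns about customers only from events, never from the customer store."""

    def __init__(self, bus: EventBus) -> None:
        self.known_customers: Dict[str, str] = {}
        bus.subscribe("CustomerCreated", self._on_customer_created)

    def _on_customer_created(self, event: dict) -> None:
        self.known_customers[event["customer_id"]] = event["name"]


bus = EventBus()
billing = BillingService(bus)

# The customer service announces a fact; it does not hand out its database.
bus.publish("CustomerCreated", {"customer_id": "c-1", "name": "Ada"})
```

The event payload is the contract here. The customer service can change how it stores customers without the billing code noticing, as long as the event keeps its shape.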

Under no circumstances is that communication via a database, because each of our applications treats its database (assuming it has one) as an internal implementation detail, subject to change without any notice or consideration for other applications.

If one of our services handles customers, for instance, it might write those customers into a file on the filesystem, it might write them to MongoDB as a document, it might create them by replaying a series of domain events from a journal, or it might hold them all in memory and hope like hell the power doesn’t fail. It is not the business of any external application how the customer service does these things. All the external apps know is the API that the customer service exposes.
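To make that concrete, here is a small Python sketch under some loud assumptions: the CustomerService, CustomerStore, InMemoryStore and JsonFileStore names are all made up for illustration, and a real service would sit behind REST or events instead of a direct method call. It shows the same caller code working unchanged while the storage underneath is swapped.

```python
import json
import os
import tempfile
from typing import Dict, Optional, Protocol


class CustomerStore(Protocol):
    """Internal storage contract (hypothetical). Callers never see this."""

    def save(self, customer_id: str, name: str) -> None: ...
    def load(self, customer_id: str) -> Optional[str]: ...


class InMemoryStore:
    """Holds everything in memory and hopes the power doesn't fail."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, customer_id: str, name: str) -> None:
        self._rows[customer_id] = name

    def load(self, customer_id: str) -> Optional[str]:
        return self._rows.get(customer_id)


class JsonFileStore:
    """Writes customers into a single JSON file on the filesystem."""

    def __init__(self, path: str) -> None:
        self._path = path

    def _read_all(self) -> Dict[str, str]:
        if not os.path.exists(self._path):
            return {}
        with open(self._path, encoding="utf-8") as fh:
            return json.load(fh)

    def save(self, customer_id: str, name: str) -> None:
        rows = self._read_all()
        rows[customer_id] = name
        with open(self._path, "w", encoding="utf-8") as fh:
            json.dump(rows, fh)

    def load(self, customer_id: str) -> Optional[str]:
        return self._read_all().get(customer_id)


class CustomerService:
    """The only thing external code is allowed to talk to."""

    def __init__(self, store: CustomerStore) -> None:
        self._store = store

    def register(self, customer_id: str, name: str) -> None:
        # The safeguard lives in application code, so nobody can skip it.
        if not name.strip():
            raise ValueError("customer name must not be empty")
        self._store.save(customer_id, name.strip())

    def get_name(self, customer_id: str) -> Optional[str]:
        return self._store.load(customer_id)


def external_caller(service: CustomerService) -> Optional[str]:
    # Knows the service API and nothing about how customers are stored.
    service.register("c-1", "  Ada ")
    return service.get_name("c-1")


in_memory_result = external_caller(CustomerService(InMemoryStore()))
with tempfile.TemporaryDirectory() as tmp:
    file_store = JsonFileStore(os.path.join(tmp, "customers.json"))
    file_result = external_caller(CustomerService(file_store))
```

Both calls go through the same register and get_name methods, so the whitespace trimming and the empty-name check apply no matter which store is plugged in. An application that wrote straight into the JSON file would bypass both.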

This is critical, as it allows us to rapidly, easily and safely change anything about a service except its external interface. We can change the external interface too; we just have to consider all the callers of that interface and make the change on their end (or handle multiple versions of the interface temporarily, which is what usually happens).
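One way to handle multiple versions of an interface for a while is to keep a single internal representation and serialize it differently per version. The sketch below is only an assumption about how that might look; the Customer fields, the v1 and v2 payload shapes and the get_customer function are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Customer:
    """Internal representation; free to change as long as both views still work."""

    customer_id: str
    given_name: str
    family_name: str


def to_v1(customer: Customer) -> dict:
    # Old contract: a single combined name field.
    return {
        "id": customer.customer_id,
        "name": f"{customer.given_name} {customer.family_name}",
    }


def to_v2(customer: Customer) -> dict:
    # New contract: the name is split into two fields.
    return {
        "id": customer.customer_id,
        "given_name": customer.given_name,
        "family_name": customer.family_name,
    }


SERIALIZERS: Dict[str, Callable[[Customer], dict]] = {"v1": to_v1, "v2": to_v2}


def get_customer(customer: Customer, version: str = "v1") -> dict:
    """Return the customer in the shape the caller's API version expects."""
    try:
        serialize = SERIALIZERS[version]
    except KeyError:
        raise ValueError(f"unsupported API version: {version}") from None
    return serialize(customer)


ada = Customer("c-1", "Ada", "Lovelace")
old_view = get_customer(ada, "v1")
new_view = get_customer(ada, "v2")
```

Once every caller has moved to v2, the v1 serializer can be deleted without touching anything else. That option does not exist when callers read the table directly.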

Just as you don’t want other classes to know the internal workings of any class in your system (encapsulation), but only to use that class via its exposed interface, so you also want your services and applications to follow the same rule.

The Consequences

I’ve had some people tell me that they just love big monolithic databases that span multiple applications, that it helps them get stuff done. What I hear is that they love being able to bypass their applications’ logic and safeguards and muck with the critical inner workings of the company’s lifeblood without so much as washing their hands.

Moving some of the application’s logic into the database, via the abomination of stored procedures or triggers, makes a bad situation worse.

The symptom of this thinking is a gradual slow-down in the pace of change, and the increase in risk with each new feature that works on the same database. Bugs crop up in strange and subtle ways, as the data is squirming underneath the applications, which are now forced to endure anemic domains, as they can’t rely on any logic they put in their domain to actually do anything correctly. Over time, if you ask someone what their domain model is, they’ll point to their database schema. This is the terminal stage of the disease, if you’re not careful.

The schema ossifies, becoming harder and harder to change, and the release cycle slows down (if it wasn’t already slow) to allow for “scripts” or “jobs” to be run to further fiddle with data to make each modification to the code possible.

Cost of change skyrockets in this scenario.

The other cry I hear is “ad-hoc reporting”, but I’ve already dealt with that red herring elsewhere, so I won’t repeat my reasoning.

My conclusion is that the so-called benefits of “sharing” a database are in fact flaws, and that they are serious and avoidable, especially in any system that needs and wants to continue to be maintainable over time.

Now, there are actual exceptions when it’s OK for two deployable units to be talking to the same database (a write model unit and a read model, for instance), but that’s a rare and carefully managed exception to the rule, and I’d recommend you not do it unless you have a team that can follow critical disciplines very carefully.

Keep your fingers out of other people’s databases. You’ll be glad you did!