Automatic Dependency Management with Maven

Apache Maven is a lot more than a “build tool”, and one of its major strengths is its ability to manage dependencies.

Maven’s not just for external dependency management, though – it can help us work faster and more easily with our own modules as well as those written by others. In fact, its “internal” dependency management is actually the more powerful feature for most development shops.

Every dependency Maven manages is identified with three pieces of information: its group id, its artifact id, and its version. The group id is often some sub-domain of the company that owns the code, e.g. com.point2.somemodule, and the artifact id identifies the specific module within that group, like rest-api or such.
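As a minimal sketch, here is how those three coordinates might appear at the top of a module's POM. The group and artifact ids are the examples from the paragraph above; the version number is an assumption made up for illustration.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- The three coordinates that identify this module -->
  <groupId>com.point2.somemodule</groupId>
  <artifactId>rest-api</artifactId>
  <!-- hypothetical version, for illustration only -->
  <version>1.0</version>

  <packaging>jar</packaging>
</project>
```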

Possibly the most interesting part is the version number, though, as this is where the real power of Maven comes to the fore. Versions allow us to maximize the opportunity for parallel development without descending into unversioned chaos. Each version represents a specific point in time in a library’s development – and, most importantly, allows us to “re-assemble” our application to a known state at any time (not re-build it).

Let’s take a for-instance to see how this might work…

Component-Based Application “Assembly”
Let’s say I’ve got a few teams working on different modules for my new application: “persistence”, “rest-api” and the user interface, “ui”. Each of these modules depends on a set of common utility classes in “util”.

We can represent this through a set of triples like so:

rest-api depends-on persistence
rest-api depends-on util (directly, and not only on the transitive dependency through persistence)
ui depends-on rest-api
persistence depends-on util

The unseen aspect here is the versioning. If we include versions in our triples, we see the picture is a bit more sophisticated:

rest-api-1.0 depends-on persistence-3.1
rest-api-1.0 depends-on util-1.1
ui-1.0 depends-on rest-api-1.0
persistence-3.1 depends-on util-1.0

Now we have a fully defined dependency graph that we can assemble into an application, say app-1.0. At any time, if we want a copy of the app in its 1.0 state, we re-construct it from these deployed modules, with no need to build any source code, and we’ve got the exact same app, every time.
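To make the versioned triples concrete, the rest-api-1.0 entries might be declared in the rest-api POM roughly like this. This is a sketch only: the module names and versions come from the triples above, while the group id is assumed to be the example one used earlier.

```xml
<!-- Fragment of the rest-api 1.0 POM; the groupId is an assumption -->
<dependencies>
  <!-- rest-api-1.0 depends-on persistence-3.1 -->
  <dependency>
    <groupId>com.point2.somemodule</groupId>
    <artifactId>persistence</artifactId>
    <version>3.1</version>
  </dependency>
  <!-- rest-api-1.0 depends-on util-1.1 (declared directly) -->
  <dependency>
    <groupId>com.point2.somemodule</groupId>
    <artifactId>util</artifactId>
    <version>1.1</version>
  </dependency>
</dependencies>
```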

Get it in motion…
Now let’s look at this in a dynamic development environment, where we’re trying to maximize sustainable velocity:

Although there are dependencies between the modules, we don’t want to hold up one team by forcing them to build the other teams’ modules unnecessarily. We also want each team to choose whether to work with the very latest version of the other modules, or to work against a fixed and stable version for a time instead.

The “ui” team, for example, might be refactoring JavaScript code that’s relying on version 1.0.3 of “rest-api”, while “rest-api” in turn is already working on 1.0.4 – and it uses 1.1.0 of “persistence”… it can get tangled in a hurry without a way to manage it, and we don’t want to be artificially discouraged from writing modular code just because it’s hard to keep version numbers straight.

Enter Maven again. Instead of forcing everyone to always work with the latest version of every other module (which can bring productivity to a screeching halt in some situations), we allow each team to decide which versions of its dependencies to declare in its POM (Project Object Model) file.

What if I want the very latest version of “persistence” while I work on “rest-api”, with changes checked in by other developers while I’m still working? This is where the SNAPSHOT version comes into play. Instead of declaring a dependency on 1.1.0, I declare a dependency on 1.1.1-SNAPSHOT. This represents the latest “edge” code for the referenced dependency.
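A sketch of what that declaration might look like in the rest-api POM, using the 1.1.1-SNAPSHOT version from the paragraph above (the group id is again an assumption):

```xml
<dependency>
  <!-- groupId assumed for illustration -->
  <groupId>com.point2.somemodule</groupId>
  <artifactId>persistence</artifactId>
  <!-- The -SNAPSHOT suffix asks Maven for the latest in-progress build of 1.1.1 -->
  <version>1.1.1-SNAPSHOT</version>
</dependency>
```

Switching back to a stable build is then just a matter of changing the version element to a released number such as 1.1.0.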

Now we have a graph that looks like this:

app-1.0-SNAPSHOT depends-on ui-1.1-SNAPSHOT
rest-api-1.1-SNAPSHOT depends-on persistence-3.1-SNAPSHOT
rest-api-1.1-SNAPSHOT depends-on util-1.1
ui-1.1-SNAPSHOT depends-on rest-api-1.1-SNAPSHOT
persistence-3.1-SNAPSHOT depends-on util-1.1

As you can see, we have a mix of stable versioned modules (util in this case), and “on the fly” versions. Yet at the same time we’re assured that major changes that break backwards compatibility will not be seen, as we indicate such changes with a change in our major version number (e.g. 1.X to 2.0).

Then we can set up a CI job (say on TeamCity, Bamboo, or whatever your CI system of choice is) to automatically build and deploy our SNAPSHOT version of “persistence” to our company Maven repository (Nexus in our case, within our company firewall). The SNAPSHOT version actually turns into a date/timestamped version when it’s deployed to Nexus, and Maven is clever enough to fetch the most recent of these SNAPSHOTs for us when we build (by default it checks the remote repository for newer SNAPSHOTs once a day; running Maven with -U forces the check). The “persistence” team checks in some code, CI builds it and deploys the resulting SNAPSHOT jar to our repository, and we get it automagically the next time we build, even though we’re working on rest-api, not persistence.
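For the CI job's "mvn deploy" to know where to publish, the module's POM needs a distributionManagement section. A minimal sketch follows; the repository ids and URLs are placeholders invented for illustration, not real endpoints.

```xml
<distributionManagement>
  <!-- Released versions go here; id and URL are hypothetical -->
  <repository>
    <id>company-releases</id>
    <url>https://nexus.example.internal/repository/maven-releases/</url>
  </repository>
  <!-- SNAPSHOT builds from CI go here; id and URL are hypothetical -->
  <snapshotRepository>
    <id>company-snapshots</id>
    <url>https://nexus.example.internal/repository/maven-snapshots/</url>
  </snapshotRepository>
</distributionManagement>
```

Maven picks the target automatically: a version ending in -SNAPSHOT is sent to the snapshotRepository, anything else to the repository.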

When we’re ready to “stabilize” our dependencies, we simply switch from the SNAPSHOT to a specific version. Maven has a pre-defined “release” process that guarantees, among other things, that every released version has no remaining SNAPSHOT dependencies, is tagged in version control, and is verified via all its tests. More than a build tool indeed…

We could of course just put all the modules we’re going to depend on in an aggregator POM, and build everything every time we make a change, but this is hardly efficient, and limits our development velocity unnecessarily (and of course we might not all be in the same source tree, or even the same version-control repository). We want to be building smaller pieces, not bigger ones.

A critical part of this process is our company-local Maven repository – here I mean not just the developer-local repository on each developer’s own workstation, but a product like Nexus that holds a company-wide copy of all required jars for a build. By doing this, we can guarantee a consistent copy of all our required dependencies without having to depend on the availability of outside repositories, such as ibiblio. In fact, it’s not a bad idea to permit access *only* to the local repository when building releases, which ensures this policy is not violated accidentally – while at the same time keeping the “external” Maven repos available to developers for experimenting and prototyping. Once something gets used in production code, however, it gets stored in the “inside the firewall” Nexus repo (and backed up from there). This avoids the bad practice of checking jar files into source control (it’s called “source” control for a reason).
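One way to enforce the "local repository only" policy on a release build machine is a mirror entry in that machine's settings.xml, which routes every repository request through the company server. This is a sketch under assumptions: the mirror id and URL are made up for illustration.

```xml
<settings>
  <mirrors>
    <mirror>
      <!-- hypothetical id and URL for the in-house repository -->
      <id>company-nexus</id>
      <!-- "*" sends requests for all repositories to this mirror -->
      <mirrorOf>*</mirrorOf>
      <url>https://nexus.example.internal/repository/maven-public/</url>
    </mirror>
  </mirrors>
</settings>
```

Developer workstations can simply omit this entry (or use a narrower mirrorOf pattern) to keep the external repositories reachable for prototyping.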

Testing, Testing…
To add a new aspect to the problem, let’s say that it’s not only production code we depend on, but helper classes for tests as well. If it’s difficult to set up a fixture for a certain kind of test, that might be a code smell in and of itself, but that’s another story. If some test helpers reside in the modules we depend on, we won’t be able to see those helpers from the tests in our own module, as we’re only depending on the other module’s production code, not its test code.

We can easily tell Maven to also bundle up the test code from a certain module, however, and make it available to us in a jar file, like so:
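A minimal sketch of such a configuration, assuming the standard maven-jar-plugin and its test-jar goal, placed in the POM of the module that owns the test helpers:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <executions>
        <execution>
          <goals>
            <!-- Packages the compiled test classes into an additional jar -->
            <goal>test-jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```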


Now when we build, we’ll get a test jar as well as our regular jar, which we can depend on like so:
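A sketch of that dependency declaration, assuming the test jar was produced by the test-jar goal of maven-jar-plugin. The group id and version here are hypothetical; "somemodule" stands for the module that owns the helpers.

```xml
<dependency>
  <!-- groupId and version are assumptions for illustration -->
  <groupId>com.point2</groupId>
  <artifactId>somemodule</artifactId>
  <version>1.0-SNAPSHOT</version>
  <!-- Selects the jar of test classes instead of the regular jar -->
  <type>test-jar</type>
  <!-- Only visible to our tests, never to production code -->
  <scope>test</scope>
</dependency>
```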


Now our test classes in the module declaring the above dependency can see the test helpers in the somemodule module – but we’re still not including test code in our production jar.

Again, I have to emphasize that this level of coupling might indicate a deeper issue, but if you do need to do this, it’s good to know how.

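Maven also includes facilities to analyze and clean up a complex dependency tree, remove unnecessary dependencies, and keep the whole project manageable. The dependency plugin’s dependency:tree and dependency:analyze goals, for example, print the resolved tree and flag dependencies that are declared but unused, or used but undeclared.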

In summary, Maven can handle extremely complex dependency management for us in a fully declarative and versioned manner, allowing us at any moment to see exactly what our project depends on, both in production and test code. In conjunction with a CI system (like TeamCity) and repository server (like Nexus), we can automate the deployment of intermediary and full-release versions to the point where we save significant time, and never build code that we’re not actually working on, allowing us to concentrate on the task at hand and leaving the heavy lifting to Maven.

This allows us to only ever build the code we’re actually changing – never code that’s already available in another library, reducing our developer cycle time significantly. It also means we’re spending more time “assembling” software from re-usable components than re-compiling (and probably re-testing) code that’s already verified and available in object form.

Maven: not just for breakfast anymore.