I’m defining “a solution” as the application, or set of applications, under current development that need to work together (and hence be tested in an integrated manner) before release to production.
In my experience, continuous delivery pipelines work extremely well when you have a simple solution with the following convenient characteristics:
All code and configuration is stored in one version control repository (such as Git).
The full solution can be deployed all the way to production without needing to test it in conjunction with other applications/components under development.
The solution is deployed into a fully managed platform (a PaaS), so you don’t have to care about the version of the environment you are deploying into.
The build is quick to run (less than 5 minutes).
The automated tests are quick to run (in minutes).
The automated test coverage is sufficient that the risks associated with releasing software are lower than the benefits.
The first three characteristics are what I call “solution complexity” and what I want to discuss in this post.
Here is a nice simple depiction of an application ticking all the above boxes.
Developers can make changes in one place, knowing that their change will be fully tested and that, when deployed into the production platform, their application will behave exactly as expected. (I’ve squashed the continuous delivery (CD) pipeline into just one box, but inside it I’d expect to see a succession of code deployments and automated quality gates.)
But what if the solution is more complex?
What if we fail to meet the first characteristic, and our code is in multiple places and possibly not all in version control? This is definitely a common problem, particularly for configuration and data-loading scripts. However, it isn’t especially difficult to solve from a technical perspective: get everything managed by a version control tool like Git.
Depending on the SCM tool you use, you shouldn’t feel obliged to use a single repository. If you do use multiple repositories, most continuous integration tools (Jenkins, for example) can be set up to handle builds that consume from more than one. If you are using Git, you can even manage this complexity within version control itself, for example with submodules.
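Whichever approach you take, the key is that every build is assembled from pinned versions of each repository. Here is a minimal sketch of that idea: a manifest that maps repositories to specific commits, from which a CI job could generate its checkout steps. The repository names, URLs, and SHAs are hypothetical, and the manifest format is an assumption rather than any particular tool’s convention.

```python
# Minimal sketch: pinning one build to specific commits across several
# repositories. Repo names, URLs, and commit SHAs are hypothetical.
BUILD_MANIFEST = {
    "app-code":     {"url": "git@example.com:team/app-code.git",     "commit": "1a2b3c4"},
    "app-config":   {"url": "git@example.com:team/app-config.git",   "commit": "5d6e7f8"},
    "data-loaders": {"url": "git@example.com:team/data-loaders.git", "commit": "9a0b1c2"},
}

def checkout_commands(manifest):
    """Generate the git commands a CI job would run to assemble the workspace."""
    cmds = []
    for name, repo in sorted(manifest.items()):
        cmds.append(f"git clone {repo['url']} {name}")
        cmds.append(f"git -C {name} checkout {repo['commit']}")
    return cmds

for cmd in checkout_commands(BUILD_MANIFEST):
    print(cmd)
```

Git submodules achieve much the same thing natively: the parent repository records the exact commit of each submodule, so one SHA pins the whole solution.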
What if the solution includes multiple applications like the following?
Suddenly our beautiful pipeline metaphor breaks down, and we have a network of pipelines that need to converge (analogous to fan-in in electronics). This is overwhelmingly the norm and certainly makes things more difficult. We now have to carefully consider how our plumbing is going to work. We need to build what I call an “integrated pipeline.”
Designing an integrated pipeline is all about determining the points of integration (POI)—the first time that testing involves a combination of two or more components. At this point, you need to record the versions of each component so that they are kept consistent for the rest of the pipeline. If you fail to do this, earlier quality gates in the pipeline are invalidated.
In the example below, Applications A and B have their own CD pipelines, where they are deployed to independent test environments and face a succession of independent quality gates. Whenever a version of Application A or B gets to the end of its respective pipeline, instead of going into production, it moves into the integrated pipeline and creates a new integrated (or composite) build number. After this POI, the applications progress toward production in the same pipeline and can only move in sync. In the diagram, version A4 of Application A and version B7 of Application B have made it into integration build I8. If integration build I8 makes it through the pipeline, it is deemed worthy of progressing to production.
Depending on the tool you use for orchestration, there are different ways of achieving this. It doesn’t have to be complicated: you are simply aggregating version numbers, which can easily be stored together in a text file in any format you like (YAML, a POM, JSON, etc.).
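To make that concrete, here is a minimal sketch of recording an integrated build at the POI. The component names and version numbers follow the A4 + B7 → I8 example above; the JSON layout itself is an assumption, not a standard.

```python
import json

# Minimal sketch: at the point of integration, aggregate the component
# versions that make up one integrated build into a single manifest.
# The file layout is an assumption; versions follow the A4 + B7 -> I8 example.
def record_integrated_build(integration_number, component_versions):
    """Return the manifest for one integrated build as a JSON string."""
    manifest = {
        "integration_build": integration_number,
        "components": component_versions,
    }
    return json.dumps(manifest, indent=2, sort_keys=True)

print(record_integrated_build("I8", {"application-a": "A4", "application-b": "B7"}))
```

From the POI onward, every deployment and quality gate in the integrated pipeline reads this manifest rather than picking up “latest,” which is what keeps the earlier quality gates valid.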
Some people reading this may by now be ready to scream “MICROSERVICES” at their screens. Microservices are, by design, independently deployable services. The independence is achieved by ensuring that each service fulfills, and expects to consume, strict API contracts, so that integration with other services can be managed and components can be upgraded independently. A convention like SemVer can be adopted to manage changes to contract compatibility. If you are implementing microservices and achieving this independence between pipelines, that’s great. Personally, on the one microservices solution I’ve worked on so far, we still opted for an integrated pipeline that operated on an integrated build and produced predictable upgrades to production (we are looking to relax that at some point in the future).
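The SemVer convention mentioned above can be sketched as a simple check: a provider version satisfies a consumer’s requirement if the major versions match (no breaking contract change) and the provider is at least as new. This follows the usual SemVer interpretation rather than any specific library’s API.

```python
# Minimal sketch of a SemVer-style contract compatibility check.
# Same major version = no breaking change; provider must be at least
# as new as the version the consumer was built against.
def parse(version):
    major, minor, patch = (int(x) for x in version.split("."))
    return major, minor, patch

def is_compatible(provider_version, required_version):
    p_major, p_minor, p_patch = parse(provider_version)
    r_major, r_minor, r_patch = parse(required_version)
    return p_major == r_major and (p_minor, p_patch) >= (r_minor, r_patch)

print(is_compatible("2.4.1", "2.3.0"))  # True: same major, newer minor
print(is_compatible("3.0.0", "2.3.0"))  # False: breaking major change
```

A check like this is what lets independent pipelines deploy a service without re-testing every consumer, provided the contracts really are honored.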
Depending on how you implement your automated deployment, you may have deployment automation scripts that live separately from your application code. Obviously we want to use a consistent version of these throughout deployments to the different environments in the pipeline, so I strongly advise managing these scripts as a versioned component in the same manner.
What if you are not using a PaaS?
The simplest way to answer this is: build your own PaaS (something I’ve advocated in earlier posts and call Platform as an Application, or PaaA).
If you are not deploying into a fully-managed platform service, you have to care about the version of the environment that you are deploying into. The great thing about building a platform application using infrastructure code is that you can treat the code just like you would for any other application. Give it a pipeline and feed it into the integrated pipeline as soon as there is a code deployment (i.e., probably a fairly early POI).
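Pulling the threads together: once the platform itself is code, its version joins the integrated build alongside the applications and the deployment scripts. Here is a minimal sketch of such a manifest; the component names and version numbers (I9, A5, D12, P3, and so on) are hypothetical.

```python
import json

# Minimal sketch: an integrated build that pins the applications, the
# deployment scripts, and the platform (infrastructure code) together.
# All names and version numbers here are hypothetical.
integrated_build = {
    "integration_build": "I9",
    "components": {
        "application-a": "A5",
        "application-b": "B7",
        "deploy-scripts": "D12",  # deployment automation, versioned as a component
        "platform": "P3",         # version of the infrastructure code
    },
}
print(json.dumps(integrated_build, indent=2, sort_keys=True))
```

With the platform pinned like this, “it worked in test” really does mean it was tested against the same environment definition it will meet in production.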
(Thanks to Tom Kuhlmann for the graphic symbols.)