We typically follow a few essential steps to package Java-based microservices (based on Dropwizard, in this case) into a Docker image.
The usual starting point is a single "fat" JAR: all classes of the project and its dependencies are copied into one JAR, plus meta-information about where to start. More advanced variants also relocate and strip classes, an approach pioneered by the Maven Shade Plugin and now widely available across the ecosystem and its build tools.
Let’s go Docker
Packaging this up in a Docker image is very straightforward: just add the JAR file, some base configuration and an entry point, and you are done. To illustrate, let’s start out with a bare-bones, OpenJDK-based JRE base image.
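A minimal sketch of such an image, written as a shell snippet that generates the Dockerfile. The base image tag openjdk:8-jre-alpine, the JAR name my-service.jar and the config.yml file are assumptions made for illustration, not the original setup; "server config.yml" is the usual Dropwizard start command.

```shell
# Write a bare-bones Dockerfile for a shaded (fat) Dropwizard JAR.
# Image tag, file names and paths are hypothetical placeholders.
cat > Dockerfile <<'EOF'
FROM openjdk:8-jre-alpine
WORKDIR /app
# The whole fat JAR ends up in a single image layer.
COPY target/my-service.jar /app/my-service.jar
COPY config.yml /app/config.yml
ENTRYPOINT ["java", "-jar", "/app/my-service.jar", "server", "/app/config.yml"]
EOF
```

Building the image then amounts to running docker build -t my-service . in the project root.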
While this is ok, it has a serious drawback:
Your big fat JAR (we see common sizes ranging up to 80 megabytes) is one layer in the Docker image;
Hence, every small rebuild in your CI produces a big new layer that has to be uploaded and then downloaded again;
This not only slows the build (shading all those JARs takes time) but also makes the whole "just run the container" experience rather sluggish.
Most of the time, however, only the code itself changes (a JAR of a few kilobytes after the build) while the dependencies stay stable.
Smarter Java-based Docker images
Instead, try this layering:
Base layer(s) (fully cached till JRE / Alpine changes);
One layer with dependencies (mostly cached unless dependencies change);
Then comes the tiny layer with the freshly built code (new every time).
The tricky part here is that the dependency layer will always be rebuilt in the CI chain, so we need to ensure that its hash value stays the same. After much cross-platform experimentation (OS X behaved differently from Linux), we came up with the following strategy:
Get dependencies dumped into a directory
This is straightforward in Maven: Just call the dependency plugin (no POM modification needed). Preventing snapshot updates ensures the same dependencies are used as downloaded by the preceding build within the CI chain.
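As a sketch, the call could look as follows. The exact command line is an assumption rather than the original one: it uses the copy-dependencies goal of the Maven Dependency Plugin, whose default output directory is target/dependency, and Maven's --no-snapshot-updates switch. The function name copy_dependencies is made up for this example.

```shell
# Copy all project dependencies into target/dependency (the plugin's default
# output directory) without checking for newer snapshot versions.
# copy_dependencies is a hypothetical helper name.
copy_dependencies() {
  mvn --no-snapshot-updates dependency:copy-dependencies
}
```

Call copy_dependencies from the project root after the regular build step of the CI chain.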
The result is a (big) set of files in target/dependency. When we simply added these files to the Docker image, we saw new hash values (and hence new image layers) on every build, probably triggered by file metadata and/or the order in which files were copied.
Including dependencies as a stable image layer
The final approach we came up with (and which works robustly on OS X and Linux) is to TAR the files together in a defined order, reset the metadata and then add the archive to the Docker image. In short: create the sorted TAR, move it into the main directory, reset its metadata and compute an MD5 so that we can compare the result between builds.
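The steps above can be sketched as a small shell function. This is not the original script: the function name pack_dependencies, the fixed timestamp and the file mode are assumptions, and only options shared by GNU tar and BSD tar are used. File ownership is not normalised here, so the archive is only expected to be identical when built by the same user.

```shell
# Pack a dependency directory into a reproducible TAR archive.
# pack_dependencies and its defaults are hypothetical.
pack_dependencies() {
  dep_dir="${1:-target/dependency}"
  out="${2:-dependencies.tar}"
  list="$(mktemp)"

  # Sort names in the C locale so the order does not depend on the platform.
  (cd "$dep_dir" && find . -type f | LC_ALL=C sort) > "$list"

  # Reset mode and modification time, which otherwise differ between builds.
  (cd "$dep_dir" && while IFS= read -r f; do
    chmod 644 "$f"
    TZ=UTC touch -t 200001010000 "$f"
  done < "$list")

  # Archive the files in exactly the sorted order.
  tar -cf "$out" -C "$dep_dir" -T "$list"
  rm -f "$list"
  TZ=UTC touch -t 200001010000 "$out"

  # Print an MD5 so two builds can be compared (md5sum on Linux, md5 on OS X).
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$out"
  else
    md5 -q "$out"
  fi
}
```

Running pack_dependencies in the project root leaves dependencies.tar next to the Dockerfile; if the printed MD5 matches the previous build, the archive is byte-identical.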
Then we modify the Dockerfile. The ADD command unpacks the archive, and the files land in a dependency directory. Please note that the dependencies now live in a separate directory, so Java has to be launched with the classic classpath syntax, which also means we need to name the main class explicitly.
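A sketch of the modified Dockerfile, again generated from a shell snippet. The image tag, the paths, the thin JAR name my-service.jar and the main class com.example.MyServiceApplication are hypothetical placeholders. The classpath wildcard is expanded by the JVM itself, so no shell is needed in the entry point.

```shell
# Write a layered Dockerfile: dependencies first, application code last.
# All names and paths are hypothetical placeholders.
cat > Dockerfile <<'EOF'
FROM openjdk:8-jre-alpine
WORKDIR /app
# ADD unpacks the local TAR; this layer stays cached while the archive is unchanged.
ADD dependencies.tar /app/dependency/
# The small, frequently changing layers come last.
COPY target/my-service.jar /app/my-service.jar
COPY config.yml /app/config.yml
ENTRYPOINT ["java", "-cp", "/app/my-service.jar:/app/dependency/*", "com.example.MyServiceApplication", "server", "/app/config.yml"]
EOF
```

The order matters: a changed instruction invalidates the cache for every instruction after it, so the application JAR is copied only after the dependency archive.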
The final result is a slim and, more importantly, fast-updating Docker image, because the dependency layer will already be cached in most situations. The dependencies could be grouped further; for example, we could build separate layers for external and internal dependencies to take this idea even further.
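One possible way to get such a split, sketched under assumptions: the includeGroupIds, excludeGroupIds and outputDirectory parameters of the copy-dependencies goal separate in-house artifacts from third-party ones. The groupId com.example, the two directory names and the function name are hypothetical.

```shell
# Hypothetical groupId of in-house artifacts.
INTERNAL_GROUP="${INTERNAL_GROUP:-com.example}"

# Dump external and internal dependencies into two separate directories,
# so that each can be packed into its own archive and image layer.
copy_split_dependencies() {
  # Third-party libraries: change rarely.
  mvn --no-snapshot-updates dependency:copy-dependencies \
    -DexcludeGroupIds="$INTERNAL_GROUP" \
    -DoutputDirectory=target/dependency-external
  # In-house libraries: change more often.
  mvn --no-snapshot-updates dependency:copy-dependencies \
    -DincludeGroupIds="$INTERNAL_GROUP" \
    -DoutputDirectory=target/dependency-internal
}
```

Each directory would then be archived separately and added to the image with its own ADD instruction, external dependencies first.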
To make the Docker container really nice, please remember to add the usual boilerplate:
Labels referring to the Git commit that was built and version of Java package;
Basics such as dumb-init to combat possible zombie processes (which we have not yet seen with the OpenJDK) and especially su-exec to switch to a non-root user (ideally named after your app for easy identification in ps).
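Put together, the boilerplate could look like the sketch below. It assumes an Alpine-based image in which the dumb-init and su-exec packages can be installed via apk; the user name my-service, the build arguments GIT_COMMIT and APP_VERSION, and all paths are hypothetical. The label keys are the ones predefined by the OCI image specification. The labels come last because their values change on every build and would otherwise invalidate the cached layers after them.

```shell
# Write a Dockerfile with init process, non-root user and build labels.
# All names, paths and build arguments are hypothetical placeholders.
cat > Dockerfile <<'EOF'
FROM openjdk:8-jre-alpine
# dumb-init reaps zombie processes, su-exec drops root privileges.
RUN apk add --no-cache dumb-init su-exec \
 && adduser -D -H my-service
WORKDIR /app
ADD dependencies.tar /app/dependency/
COPY target/my-service.jar /app/my-service.jar
COPY config.yml /app/config.yml
ENTRYPOINT ["dumb-init", "--", "su-exec", "my-service", "java", "-cp", "/app/my-service.jar:/app/dependency/*", "com.example.MyServiceApplication", "server", "/app/config.yml"]
# Labels change with every build, so they go last to keep earlier layers cached.
ARG GIT_COMMIT=unknown
ARG APP_VERSION=unknown
LABEL org.opencontainers.image.revision="${GIT_COMMIT}" \
      org.opencontainers.image.version="${APP_VERSION}"
EOF
```

In the CI job, the values would be passed with docker build --build-arg GIT_COMMIT=... --build-arg APP_VERSION=... so that each image can be traced back to its commit.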