Building Better Docker Images
It's been about eight months since I wrote down some of my thoughts on [building good docker images](/journal/building-good-docker-images). The docker ecosystem has continued moving quickly and I'm pleased to say that most of the principles listed in the old article have aged well. I don't have a ton to add, but here are a few things that I've discovered since then:

* **Base images off of Alpine**

  `alpine` is a 5 megabyte image based on Alpine Linux, available on Docker Hub. Despite being small, it comes with a package manager (`apk`) with access to a well-maintained, modern package repository. This base image is ideal for building tiny application containers (e.g. a 6 MB redis image, a 17 MB node.js 0.10 image, a 6 MB PostgreSQL client).

  Alpine presents some challenges if you need to stray beyond the package manager. Alpine uses musl libc, so most dynamically-linked "Linux" binaries that you can download off the web (which are typically built against glibc) will not run. Instead, you may find yourself building from source, in which case it's helpful to know that the Alpine `build-base` package is roughly equivalent to Debian's `build-essential`. Alpine's default shell is `ash`, but you can install `bash` through the package manager if you need or prefer it.

* **Write tests**

  I picked this up from the bright guys at Aptible. When you want to guarantee that your image has a certain feature set, it can be useful to run a suite of tests *as part of the Dockerfile*. If the image is just for you, this is probably overkill, but if other people are using or building off of your image, or if you want to make things explicit and maintainable (e.g. for others on your development team), it can be very helpful.

  With tests in the Dockerfile, a built and published image comes with guarantees about the image's behavior. If you try to rebuild the image and the tests fail, you get some indication of what has changed since the last successful build (typically some external state, as discussed in the previous article).

  Here's an example of some simple tests from my `jbergknoff/sass` repository:

      #!/bin/sh

      echo --- Tests ---

      echo -n "it should install sassc 3.2.1... "
      sass -v | grep sassc | grep "3.2.1" > /dev/null
      [ "$?" -ne 0 ] && echo nope && exit 1
      echo ok

      echo -n "it should compile SCSS... "
      echo '$blue: #00f; .thing { color: $blue; }' > /tmp/test.scss
      sass /tmp/test.scss | grep "color: #00f" > /dev/null
      [ "$?" -ne 0 ] && echo nope && exit 1
      rm /tmp/test.scss
      echo ok

  This content lives in a script which gets `RUN` as part of the Dockerfile. If a test fails, the Dockerfile build fails.

  If you'd prefer a test runner/framework, consider bats. It's a light wrapper around bash scripting, adding some structure for testing (the ability to skip tests, setup/teardown steps, etc.).

* **Use scripts**

  Sometimes it makes sense to break out a part of a Dockerfile into a shell script. For instance, while it's good to clean up after installing a package through a package manager, it can get awkward to have a long `&&` chain in a `RUN` command just to enforce cleanliness. Instead, consider making a script. Here is another example from `jbergknoff/sass`, the script that builds and installs SASS:

      #!/bin/sh

      # build
      apk --update add git build-base
      git clone
      cd sassc
      git clone
      SASS_LIBSASS_PATH=/sassc/libsass make

      # install
      mv bin/sassc /usr/bin/sass

      # cleanup
      cd /
      rm -rf /sassc
      apk del git build-base
      apk add libstdc++ # sass binary still needs this because of dynamic linking.
      rm -rf /var/cache/apk/*

  This script is responsible for grabbing the SASS source code, building it, and then cleaning up. Obviously the final image (a 9 megabyte SASS application container) shouldn't have git installed, but imagine encoding that entire sequence as one big `RUN` with a long `&&` chain simply to keep git out. It seems incongruous.

  Because all of the installation and cleanup happen in one `RUN` command, there is no extra bloat hanging around (recall that each `RUN` produces its own layer, so files introduced in one `RUN` and removed in a subsequent `RUN` still take up space in the image). This technique can help all sorts of Dockerfiles, making them cleaner and easier to understand. In the case of building from source, it's almost always beneficial.
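Going back to the tests for a moment: the check-and-exit line that follows each command in the test script could be folded into a small helper. This is only a rough sketch, not what `jbergknoff/sass` actually does. The `check` function name and the two sample commands are made up for illustration, and it uses `printf` in place of `echo -n` so it doesn't depend on a particular shell's `echo`.

```shell
#!/bin/sh
# Sketch of a tiny test helper. `check` is a hypothetical name,
# not something taken from the jbergknoff/sass repository.

# check DESCRIPTION COMMAND [ARGS...]
# Prints the description, runs the command with its output discarded,
# then prints "ok" on success or prints "nope" and exits with status 1.
check() {
  description="$1"
  shift
  printf 'it should %s... ' "$description"
  if "$@" > /dev/null 2>&1; then
    echo ok
  else
    echo nope
    exit 1
  fi
}

echo '--- Tests ---'

# Placeholder checks so the sketch runs anywhere; a real image would
# test its own binaries here instead.
check 'start a shell' sh -c 'exit 0'
check 'match text with grep' sh -c 'echo "color: #00f" | grep -q "color: #00f"'
```

Because a failing check exits with a non-zero status, a `RUN` step that executes this script would still fail the build, just like the original tests. A pipeline has to be wrapped in `sh -c '...'` since `check` runs a single command, so the sassc version test might become something like `check 'install sassc 3.2.1' sh -c 'sass -v | grep sassc | grep -q "3.2.1"'`.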