Docker Containers and Builds

The one thing you simply cannot go without when running builds is 100% stability of the build environment. Docker is the obvious tool of choice for achieving this in 2016. A few crucial things should be kept in mind when setting this up, though.

Avoid Host-Volumes

Host volumes, that is, volumes created by mounting a host path into the container, are your natural enemy when trying to achieve stability and portability for your build. They introduce the host environment into the container, and this is not a theoretical issue at all! Just look at this recent question about running a MySQL database from a host volume and the hoops I had to advise the author to jump through to make it work. Another classic issue arising from host volumes is that the container will often run as the root user, or at least as a different user than the one you are currently logged in as. This means that files created by the container, which you may want to retrieve from the volume later, will belong to root and will have to get their owner changed before you can do anything else with them. Again, this requires actions on the host, impeding portability and hence the long-term stability of the build (reinstalling the host machine, switching cloud providers … you name it!).
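To make the ownership issue concrete, here is a rough sketch of the pattern in question; the paths and the image name are only placeholders:

# Mounting a host path into the container ties the build to this particular host:
$ docker run --rm -v "$PWD/out:/out" -t username/tagname

# Whatever the container wrote to /out will typically belong to root on the host,
# so an extra host-side step is needed before you can use the files:
$ sudo chown -R "$(id -u):$(id -g)" out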

The worst part, though, is that host volumes will not work with remote Docker hosts. Only if the machine you are starting the build from is also running the Docker daemon will you be able to use host volumes. Otherwise your build steps will need to include explicitly uploading the necessary files to the host before starting the container. Oftentimes cloud providers of Docker daemons like Amazon ECS will not even offer this option.

Lock in Version Numbers

This next point I cannot stress enough. When setting up build environments, people often add things like:

RUN apt-get install -y ruby

into their Dockerfile, and it works just fine for them … until it stops working. Adding a statement like the above to the Dockerfile tells Docker (technically apt in this case …) to install the latest Ruby version available for the Linux distribution used. Especially in the Ruby world this will quickly lead to issues in many cases, as Ruby seemingly often favors innovation over backwards compatibility. Eventually the Ruby version installed by this might jump from 1.8.x to 1.9.x and all of a sudden your build will inexplicably break. This case might be somewhat easy to debug, but let me give you an example of how bad it can get:

A while back I created a build environment for running Selenium-based tests against a very complex setup of multiple web applications. It took a while to set things up, but eventually 4 web server containers were being tested in the craziest spots imaginable, the build was stable over hundreds of runs, and the client and I were happy. What I had done, though, was install the latest Selenium version instead of properly locking in the version.

A few weeks later the client made some design changes to one of his sites and asked me to adjust the tests after his developers had failed to do so, since one test case was now failing erroneously. So I rebuilt the setup from the Dockerfiles on my laptop, from which I had previously deleted it to save some disk space, and started working on it. To my surprise not just one, but more than half of the tests were failing right off the bat. At first I thought the client had perhaps recently broken something in the development versions of his apps, so I reverted to versions I knew to have passed the tests before. No change, the same tests still kept breaking. I went at it debugging things, making some adjustments to the Dockerfiles, adjusting the tests, and still I could not find the reason for things to have broken in the first place. The errors I saw indicated Selenium clicking the wrong elements where it had previously clicked the right ones thousands if not millions of times.

Eventually I got a little tired of my laptop building Docker images a lot slower than the EC2 instance I had initially developed this on, so I moved back to working on that. To my surprise the tests instantly started passing again on the EC2 instance, and then it hit me: the EC2 instance still had the old images. Something in the versions must have changed. I quickly found out Selenium had made a change breaking backwards compatibility when it came to hovering elements, I locked in the known-to-work Selenium version in the Dockerfile, and things started working fine on my laptop too.

The moral of the story: not locking in versions means having to troubleshoot tests and production code simultaneously. You should be able to trust your tests, and the easiest way to achieve this is to simply make the build 100% deterministic. Lock in the version of your base Docker image, lock in the versions of everything you put into it, and you'll be fine. Whenever your tests break, you know it's the code, thanks to your tests being stable and frozen in time.
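As a rough sketch of what locking things in can look like (the image tag, package name and version string below are purely illustrative, pick whatever you have verified to work; the same idea applies to gems, pip packages and anything else you install):

# Pin the base image to an exact tag instead of relying on :latest
FROM ubuntu:14.04

# Pin apt packages to an explicit version using the package=version syntax
# (the version string here is just an example)
RUN apt-get update && apt-get install -y ruby1.9.1=1.9.3.484-2ubuntu1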

A Clean Build Has Two or Four Steps

Build Image

The first step of a clean build should simply be sending a build context to the Docker daemon. You can send a tarball compressed by the method of your choice and the daemon will extract its contents and build the image. When deciding on how to set up your images, you should try your best to keep things homogeneous in terms of paths across your projects, to avoid headaches with documentation and configuration. An example of how to do this can be found in this skeleton I created for a former employer. You can see a few folders, each holding the build context for one container. All of those that require a default command different from their base image simply get it in the form of a run.sh bash script, which is then always added to the final image in the exact same way, at the end of the build, via this snippet:

COPY run.sh /run.sh
RUN chmod +x /run.sh
CMD ["/bin/bash", "-lc", "/run.sh"]

Also, you should ensure that any output your run.sh produces, say a .jar, coverage reports or compiled documentation, is saved to a specific folder. My suggestion would be /out. Read on to find out why :)
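For illustration only, such a run.sh could look roughly like this (the test command is a placeholder for whatever your project actually runs):

#!/bin/bash
set -e

# Everything worth keeping after the run goes into one well-known folder
mkdir -p /out

# Placeholder for your actual build/test command; the important part is
# that reports, jars and other artifacts end up under /out
bundle exec rspec --format progress --format html --out /out/rspec.html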

From the command line, building your image now simply becomes:

$ docker build -t username/tagname /path/to/build/context
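Since the daemon also accepts the build context as a tarball on stdin, the same build can be kicked off like this when the context already exists as an archive (gzip, bzip2 and xz compression are understood):

$ tar -czf - -C /path/to/build/context . | docker build -t username/tagname -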

Run Command

The next step is to run a container from the image we just built. On the command line, for the case where you only need the stdout produced by run.sh, this simply turns into:

$ docker run --rm -t username/tagname

You May Need More than Stdout!

Oftentimes just getting stdout means that you get the exit status of your build and its direct console output, say the list of failed and passed tests. But what about that nice code coverage HTML report, for example? Those are files you will need to retrieve from the container after it has exited. In this case you also cannot use the --rm flag on the container run, given that you don't want to remove the container but access it once it has stopped.

So our run command in bash would come out to just:

$ docker run -t username/tagname

Also, let’s give the container a name so it’s easier to track the individual container once it has stopped and we still want to access it:

$ docker run -t --name container_name username/tagname
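Since the container is kept around, its exit status also remains available after the fact, which is handy if a surrounding script needs to decide whether the build passed; plain docker inspect can read it back:

$ docker inspect -f '{{.State.ExitCode}}' container_name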

Optional: Download Files

Downloading files from a container using the CLI is almost as easy as copying files in a shell, thanks to the docker cp command. Assuming whatever output your previously created run.sh produced went to the /out folder in the container, you can now simply download its contents via:

$ docker cp container_name:/out /local/target/path/

Optional: Delete Container

Once you have got your build result from the container, all you need to do is delete it. Otherwise its file system will still be kept on the Docker daemon host and quickly eat up its disk space. On the command line you can simply do this by running:

$ docker rm -v container_name

The -v flag should not be forgotten here, since many base images your build could depend on have VOLUME statements in their Dockerfile. If you fail to explicitly delete any volumes that may have been created by these containers, you’ll end up with dangling volumes that eat up disk space and are hard to track down and remove.
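Should dangling volumes have piled up already, newer Docker releases (1.9 and up) let you list and remove them explicitly:

$ docker volume ls -qf dangling=true
$ docker volume rm $(docker volume ls -qf dangling=true)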