Readers of this website know that I’m no fan of Docker and prefer to use other container engines. However, I live in the real world, and this world has embraced Docker and loves it to pieces. I mean, get a room, already.
Of course, being someone who enjoys learning about technology, and on top of that a fan of container technologies, I find myself reading the Docker documentation from time to time. And, truth be told, it is quite good.
So, I’m always keen to understand the best practices for a particular piece of technology, especially as promoted by the designers themselves.
To that end, the only docs that I read for this article were the Docker docs. I mean, why would I get some watered-down version from another party, when instead I can go straight to the source? In fact, you should stop reading right now and go to the Docker docs. Why read what I think is important when you can decide for yourself?
But, if you’re still here, read on.
- Non-Privileged User
- Multi-Stage Builds
Hopefully, everyone has heard by now that you should run a container as a non-privileged user. That is, do not run as root. That is rarely a good idea.
Here, I’m creating a new group and user and then switching to it. Yay.
```
RUN groupadd \
        --gid 1000 noroot && \
    useradd \
        --create-home \
        --home-dir /home/noroot \
        --uid 1000 \
        --gid 1000 \
        noroot

...

USER noroot

...
```
To reduce layers and complexity, avoid switching `USER` back and forth frequently.
The ideas in this section aren’t best practices, in my opinion, even though Docker says they are. Rather, they are interesting ways to create containers with the Docker engine when a Dockerfile isn’t present or needed.
Why would you be interested in creating images in the following ways (i.e., dynamically)? Well, who wants to stink up their project repository with a Dockerfile? Absolutely nobody, that’s who. And, as you’ll soon learn, you don’t have to.
So, prepare those pull requests (or, merge requests) to remove the Dockerfiles from your project repositories.
When sending a Dockerfile to `docker build` using `stdin`, use the hyphen (`-`) to denote it (of course, this is a common practice for a lot of Unix tools). Creating an image this way is handy when you're building them dynamically.
I once did this at a job where I needed to test that software was being built correctly for an image and then usable in a container instance. The images used different Linux distributions for their userspace applications, so the Dockerfiles were fairly boilerplate.
So, I created a simple shell script that dynamically created multiple Dockerfiles with a few surgical changes and sent them to the `stdin` of the `docker build` command. And, I then partied like it was 1999.
Let’s quickly look at some different ways to send a Dockerfile to `docker build` on `stdin`.
```
$ echo -e 'FROM busybox\nCMD echo "hello world"' | docker build -t hello -
$ docker run --rm hello
hello world
```
```
$ docker build -t hello -<<EOF
> FROM busybox
> CMD echo "hello world"
> EOF
$ docker run --rm hello
hello world
```
Redirection and process substitution:
```
$ docker build -t foo - < <(echo -e 'FROM busybox\nCMD echo "hello world"')
$ docker run foo
hello world
```
Ok, that last one was just silly. No one would do that. I don’t think.
Importantly, none of these examples used a build context. Yes, that’s a thing.
Note that trying to `ADD` any files to the image will fail when not using a build context.
Let’s now look at the same idea, but this time using a build context.
My asbits project doesn’t include a Dockerfile, because that would just be silly. But that’s not a problem, as I can still use it as the build context and pass a Dockerfile on the fly to `docker build`.
There are a couple of ways to do this remotely:

- a Git repository
- a tarball
```
$ docker build -t asbits -f- https://github.com/btoll/asbits.git <<EOF
> FROM debian:bullseye-slim
> RUN apt-get update && apt-get install -y build-essential
> COPY . ./
> RUN gcc -o asbits asbits.c
> ENTRYPOINT ["./asbits"]
> EOF
$
$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
asbits       latest   bc827217073f   2 minutes ago   371MB
$
$ docker run --rm -it asbits 0xdeadbeef 8
1101 1110 1010 1101 1011 1110 1110 1111
```
Of course, the files copied into the image layer are from the remote Git repository.
Docker will do a `git clone` behind the scenes and then pass those downloaded files to the Docker daemon as the build context. This means that you will need to have `git` installed on the Docker build machine.
Next, let’s use a tarball as the build context (`gzip`, `bzip2`, and `xz` compression formats are supported):
```
$ docker build -t btoll/asbits:1.0.0 -f- http://192.168.1.96:8000/asbits-1.0.0.tar.gz <<EOF
FROM debian:bullseye-slim
RUN apt-get update && apt-get install -y build-essential
COPY asbits.[c,h] ./
RUN gcc -o asbits asbits.c
ENTRYPOINT ["./asbits"]
EOF
$
$ docker images
REPOSITORY     TAG     IMAGE ID       CREATED         SIZE
btoll/asbits   1.0.0   44f16aefd686   2 minutes ago   371MB
$
$ docker run --rm -it btoll/asbits:1.0.0 0xdeadbeef 8
1101 1110 1010 1101 1011 1110 1110 1111
```
Lastly, you can always use a local build context. Here, we’ll use everyone’s favorite, the current working directory (`.`):
```
$ docker build -t btoll/asbits:1.0.0 -f- . <<EOF
FROM debian:bullseye-slim
RUN apt-get update && apt-get install -y build-essential
COPY asbits.[c,h] ./
RUN gcc -o asbits asbits.c
ENTRYPOINT ["./asbits"]
EOF
```
Exclude any files not needed in the image by using a `.dockerignore` file. Anyone familiar with its better-known cousin, the `.gitignore` file, will be right at home.
For example, if you’re using NodeJS, you’d want to add `node_modules/` to `.dockerignore`. Frankly, I’m not even sure NodeJS is a real thing, but people tell me it is.
Also, really important things you don’t want in your Docker image (and that includes any layer) are things such as certificates and cryptographic keys, which commonly use the PEM format and have a `.pem` file extension.
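As a sketch, a `.dockerignore` for a hypothetical NodeJS project might look like the following (the entries are illustrative, not from any particular project):

```
# Dependencies are installed inside the image, not copied from the host.
node_modules/
npm-debug.log

# Never bake secrets into an image layer.
*.pem
*.key

# Repository noise the image doesn't need.
.git/
Dockerfile
.dockerignore
```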
Use multi-stage builds, with the `scratch` image as the base of the final stage, if possible.

Only the `RUN`, `COPY`, and `ADD` instructions create layers. Other instructions create temporary intermediate images, and don’t increase the size of the build.
Since I’m feeling incredibly lazy, I’m just going to copy the example from the docs:
```
# syntax=docker/dockerfile:1
FROM golang:1.16-alpine AS build

# Install tools required for project
# Run `docker build --no-cache .` to update dependencies
RUN apk add --no-cache git
RUN go get github.com/golang/dep/cmd/dep

# List project dependencies with Gopkg.toml and Gopkg.lock
# These layers are only re-built when Gopkg files are updated
COPY Gopkg.lock Gopkg.toml /go/src/project/
WORKDIR /go/src/project/
# Install library dependencies
RUN dep ensure -vendor-only

# Copy the entire project and build it
# This layer is rebuilt when a file changes in the project directory
COPY . /go/src/project/
RUN go build -o /bin/project

# This results in a single layer image
FROM scratch
COPY --from=build /bin/project /bin/project
ENTRYPOINT ["/bin/project"]
CMD ["--help"]
```
I sort multi-line lists out of habit to prevent duplicates, and it’s funny that Docker recommends the same. So there. Docker just went up a notch in my book.
It is no longer necessary to combine all labels (key-value pairs) into a single `LABEL` instruction. Before version 1.10, separate `LABEL` instructions could have resulted in extra layers, but no longer.
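So, multiple `LABEL` instructions are now fine (the values here are made up for illustration):

```
LABEL com.example.vendor="ACME Incorporated"
LABEL com.example.version="1.0.0"
LABEL com.example.release-date="2015-02-12"
```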
Always combine `apt-get update` and `apt-get install` in a single `RUN` instruction. The reason for this is caching. For instance, if they were in two separate `RUN` statements, they would be cached into two separate layers.
Now, suppose that you add another package for installation to the `RUN` statement. The package may not be found, because the first `RUN` layer would be retrieved from the cache, and its stale package indices may not have the information for the new package you wish to install (or may reference an older version). Bummer.
It’s a good idea to use version pinning instead of relying on the latest version of a package. For instance, include a tagged version after the package name, whenever possible:
```
RUN apt-get update && apt-get install -y \
    asbits \
    foo-package=2.1.4 \
    trivial
```
APT stores the package information (such as the `InRelease` file) that it retrieves when updating (`apt-get update`) in the `/var/lib/apt/lists/` directory. This can be deleted to save space in the final image.
The contents of `/var/lib/apt/lists/` can be safely deleted, as they will be re-downloaded the next time `apt-get update` is invoked.
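Putting the last few tips together, a single `RUN` instruction can update the indices, install pinned packages, and clean up the lists, all in one layer (the package names and versions are illustrative):

```
RUN apt-get update && apt-get install -y \
    asbits \
    foo-package=2.1.4 \
    trivial \
    && rm -rf /var/lib/apt/lists/*
```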
For those that do a lot of `bash` shell scripting, you’re probably used to the `pipefail` shell option (`set -o pipefail`). What is `pipefail`? From the Set Builtin docs:
> [T]he return value of a pipeline is the status of the last command to exit with a non-zero status, or zero if no command exited with a non-zero status.
So, if you run the following statement without the `pipefail` option set:
```
RUN wget -O - https://some.site | wc -l > /number
```
The `wget` invocation could fail, but as long as the `wc` did not, you’d get an exit value of 0, indicating success. So, the error would be swallowed, which everyone knows is no bueno.
What you want is for an error to be raised the first time something in the pipeline fails, and that’s what `pipefail` allows for.
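A quick demo outside of Docker shows the difference (a sketch, assuming `bash`; `false` stands in for any failing command):

```shell
#!/bin/bash

# Without pipefail, the pipeline's status is that of the last command,
# so the failing `false` is masked by the succeeding `wc`.
false | wc -c > /dev/null
echo "without pipefail: $?"   # prints: without pipefail: 0

# With pipefail, the pipeline fails if any command in it fails.
set -o pipefail
false | wc -c > /dev/null
echo "with pipefail: $?"      # prints: with pipefail: 1
```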
The problem, of course, is that Docker uses the Bourne shell (`sh`) to execute commands in a `RUN` instruction, and the Bourne shell doesn’t support `pipefail`. So, you’re going to need to run the command in a shell that supports it, such as `bash`.
To do that, the Docker docs suggest that you use the exec form of `RUN`:
```
RUN ["/bin/bash", "-c", "set -o pipefail && wget -O - https://some.site | wc -l > /number"]
```
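Alternatively (my suggestion, not from the snippet above), the Dockerfile `SHELL` instruction can change the default shell for all subsequent `RUN` instructions, which avoids repeating the exec form everywhere:

```
# Every RUN after this uses bash with pipefail enabled.
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN wget -O - https://some.site | wc -l > /number
```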
Each `ENV` line creates a new intermediate layer, just like `RUN` commands do. This means that even if you unset the environment variable in a future layer, it still persists in this layer, and its value can be dumped.
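If a variable is only needed at build time, one way to keep it out of the image metadata entirely is to set, use, and unset it within a single `RUN` instruction (a sketch; the variable and file names are made up):

```
FROM alpine
# Set, use, and unset the variable in one layer, so it never
# persists as an `ENV` entry in the final image.
RUN export ADMIN_USER="mark" \
    && echo $ADMIN_USER > ./mark \
    && unset ADMIN_USER
```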
If you have multiple files to be copied into an image, but different build steps (that is, `RUN` stages) rely upon different files or only a subset of the total, then break them up into separate `COPY` steps. This will help prevent some cache invalidations and will lead to faster build times.
```
COPY asbits.[c,h] ./
RUN gcc asbits.c
COPY the_universe.* ./
```
will have fewer cache invalidations for the `RUN` step than:
```
COPY the_universe.* asbits.[c,h] ./
RUN gcc asbits.c
```
Generally speaking, only use `ADD` when you need to extract a local tarball into the image. Also, rather than using `ADD` to fetch a remote package, use `wget`, so you can then remove the download, reducing the overall image size.
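As a sketch, fetching and unpacking with `wget` in a single `RUN` instruction lets you delete the archive in the same layer (the URL and paths here are hypothetical):

```
RUN mkdir -p /usr/src/things \
    && wget -O /tmp/things.tar.gz https://example.com/things.tar.gz \
    && tar -xzf /tmp/things.tar.gz -C /usr/src/things \
    && rm /tmp/things.tar.gz
```

Had `ADD https://example.com/things.tar.gz` been used instead, the archive would remain baked into its own layer even if a later instruction deleted it.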
Docker, in its documentation and in every technological nook and cranny on the Internets, wants people to think that it invented container technologies and the Linux kernel. And Unix. The squeeze play. And Louis CK’s comeback.
Yes, Docker is the Milli Vanilli of container technologies. Girl, you know it’s true.