Docker build protips

In this very short post I'm going to mention some of the tools you can use to make your Dockerfiles better:

  • su-exec: Switch user and group id, setgroups and exec.
  • dumb-init: Process supervisor and init system designed to run as PID 1 in containers.
  • yarn info: Show information about a package.
  • jq: A lightweight and flexible command-line JSON processor.

su-exec

Sometime in 2014, GitHub user tianon brought us gosu, a command used to step down from root in order to execute tasks as a specific non-root user.

gosu probably stands for Go Switch-User, because it's written in Go, and performs some of the functionality of the original unix su (switch-user) command.

Now the command su-exec provides the same functionality, but for the low, low price of 10 kB instead of 1.8 MB in your Docker image.

According to the GitHub page for su-exec, it:

... is a simple tool that will simply execute a program with different privileges. The program will be executed directly and not run as a child, like su and sudo does, which avoids TTY and signal issues (see below).

Note that su-exec must be run by the root user, because non-root users do not have permission to change uid/gid.
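
If a script can end up running as either root or a regular user, one way to cope is to check the uid before calling su-exec. Here's a minimal sketch, not taken from any real entrypoint: is_root and run_as_node are helper names made up for illustration, and it assumes su-exec is installed and a node user exists in the image.

```shell
#!/bin/sh
# Sketch only: is_root and run_as_node are hypothetical helper names.

# is_root UID: succeeds only when the given numeric uid is 0.
is_root() {
    [ "$1" -eq 0 ]
}

# run_as_node CMD [ARGS...]: run CMD as the node user when possible.
run_as_node() {
    if is_root "$(id -u)"; then
        # root is allowed to change uid/gid, so let su-exec step down to node
        su-exec node "$@"
    else
        # already unprivileged: su-exec could not change uid/gid, so run the command as-is
        "$@"
    fi
}
```

With this in place, something like run_as_node git clone ... steps down to node when the script starts as root, and simply runs the command otherwise.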

In my case I was installing a storage plugin for the Ghost CMS and I forgot to switch the user from root to node via the Dockerfile USER directive or su-exec:

cd "$GHOST_INSTALL/current/core/server/adapters/storage"; \
git clone https://github.com/eexit/ghost-storage-cloudinary.git; \

Which left me with a directory owned by root instead of the intended user, node. This means the node user that runs inside my Docker container can read the ghost-storage-cloudinary directory but is not going to be able to write anything to it:

docker-compose exec ghost sh
...
cd current/core/server/adapters/storage/
/var/lib/ghost/versions/2.23.2/core/server/adapters/storage # ls -la
total 28
drwxr-xr-x    1 node     node          4096 Jun 16 18:53 .
drwxr-xr-x    1 node     node          4096 Jun 16 18:51 ..
-rw-r--r--    1 node     node          5898 Oct 26  1985 LocalFileStorage.js
drwxr-xr-x    6 root     root          4096 Jun 16 18:54 ghost-storage-cloudinary
-rw-r--r--    1 node     node          3605 Oct 26  1985 index.js
-rw-r--r--    1 node     node          1742 Oct 26  1985 utils.js
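
A quick way to catch this kind of mix-up is to ask find for anything that isn't owned by the user you expect. A minimal sketch, where list_not_owned_by is a hypothetical helper name:

```shell
#!/bin/sh
# list_not_owned_by DIR OWNER: print every path under DIR whose owner is not
# OWNER (a user name or a numeric uid). Prints nothing when ownership is consistent.
list_not_owned_by() {
    find "$1" ! -user "$2"
}
```

Called as list_not_owned_by . node in the directory listed above, it should print the ghost-storage-cloudinary directory along with whatever inside it is also owned by root. No output means everything belongs to node.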

Using su-exec corrects this error:

	cd "$GHOST_INSTALL/current/core/server/adapters/storage"; \
	su-exec node git clone https://github.com/eexit/ghost-storage-cloudinary.git; \
	cd ghost-storage-cloudinary; \
	rm package-lock.json; \
	su-exec node yarn install --ignore-optional --production --no-lockfile; \

dumb-init

From the dumb-init GitHub page (https://github.com/Yelp/dumb-init):

dumb-init is a simple process supervisor and init system designed to run as PID 1 inside minimal container environments (such as Docker). It is deployed as a small, statically-linked binary written in C.

Lightweight containers have popularized the idea of running a single process or service without normal init systems like systemd or sysvinit. However, omitting an init system often leads to incorrect handling of processes and signals, and can result in problems such as containers which can't be gracefully stopped, or leaking containers which should have been destroyed.

Running Docker containers without an init or supervisor process can lead to a scenario where your container process dies and is not properly reaped, becoming a zombie process. If the container supervisor (say, Kubernetes) restarts the process and the original error condition still exists, subsequent spawn attempts will end up producing more undead processes, leaving you with an army of resource-eating zombies in your container. Yikes!

I won't go into the gory details of how this happens, since I intend to keep this post small and informal. So here's my example implementation:

Install

RUN apk add --no-cache \
      dumb-init \

Before

COPY ./build/ghost/docker-entrypoint.sh /usr/local/bin
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
CMD ["node", "current/index.js"]

After

COPY ./build/ghost/docker-entrypoint.sh /usr/local/bin
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/usr/local/bin/docker-entrypoint.sh", "node", "current/index.js"]
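
docker-entrypoint.sh isn't shown here, but a common convention for such scripts, assumed in this sketch, is to finish with exec "$@" so the command from CMD replaces the wrapper shell instead of running as its child. The snippet below only demonstrates that exec behaviour; pid_before_and_after_exec is a made-up name.

```shell
#!/bin/sh
# pid_before_and_after_exec: print the PID of a wrapper shell, then the PID of
# the command it execs. exec replaces the shell in place rather than forking a
# child, so the two printed PIDs are identical.
pid_before_and_after_exec() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}
```

Both printed PIDs are the same. If the entrypoint does follow that convention, the process tree in the After setup would be just dumb-init and node, with no leftover shell in between.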

Notice the difference between the two processes in the listing below. The first one, yarn run gatsby develop, is being run directly, while the second one is being run by /usr/bin/dumb-init:

docker-compose ps
        Name                       Command               State            Ports          
-----------------------------------------------------------------------------------------
docker_gatsby-ghost_1   yarn run gatsby develop -H ...   Up      127.0.0.1:8000->8000/tcp
docker_ghost_1          /usr/bin/dumb-init -- /usr ...   Up      127.0.0.1:2368->2368/tcp

yarn info

While I was working with a contrib Dockerfile for Ghost, I noticed that it was switching between using npm and yarn in various places. Mainly for consistency's sake, I wanted to keep everything within one package manager.

You can use the npm view command to look up information about a package. Passing . refers to the package in the current directory:

# scrape the expected version of sqlite3 directly from Ghost itself
&& sqlite3Version="$(npm view . optionalDependencies.sqlite3)" \

The yarn info command provides similar functionality but, unlike npm view, it doesn't print the result in a format that is friendly to shell variables:

yarn info sqlite3 version
yarn info v1.16.0
4.0.9
Done in 0.19s.

The --json flag looks more promising, but it's still not ready for variable consumption:

yarn info sqlite3 version --json
{"type":"inspect","data":"4.0.9"}

The jq utility, however, can take care of that for us.

yarn info sqlite3 version --json | jq '.data'
"4.0.9"

The output, though, comes wrapped in double quotes, which is not going to be good when trying to concatenate it into another string.

The --raw-output flag will take care of those quotes:

yarn info sqlite3 version --json | jq --raw-output '.data'
4.0.9

Now we can mix that output into a string:

sqlite3Version=$(yarn info sqlite3 version --json | jq --raw-output '.data')
/var/lib/ghost # echo "sqlite3@$sqlite3Version"
sqlite3@4.0.9
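
To reuse this in a Dockerfile RUN step, the jq part of the pipeline can be wrapped in a small function. This is a sketch under assumptions: json_data is a name picked for illustration, and it adds jq's --exit-status flag so that a missing data field produces a non-zero exit status rather than a silent null.

```shell
#!/bin/sh
# json_data: read one JSON object (as printed by yarn with --json) on stdin
# and print its .data field without the surrounding quotes.
# Returns non-zero when .data is missing, null or false.
json_data() {
    jq --raw-output --exit-status '.data'
}

# Possible usage (needs yarn and network access, so it is left as a comment):
#   sqlite3Version=$(yarn info sqlite3 version --json | json_data)
```

Checking the exit status before using the variable keeps a literal null from ending up in a package name.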

Conclusion

And that's it for today's "quick" post. I hope you enjoyed it!