12. Best practices
12.1. Other best practices information
This isn’t the last word. Also consider:

- Many of Docker’s Best practices for writing Dockerfiles apply to Charliecloud images as well.
- “Recommendations for the packaging and containerizing of bioinformatics software”, Gruening et al. 2019, is a thoughtful editorial with eleven specific containerization recommendations for scientific software.
- “Application container security guide”, NIST Special Publication 800-190; Souppaya, Morello, and Scarfone 2017.
12.2. Installing your own software
This section covers four situations for making software available inside a Charliecloud container:

- Third-party software installed into the image using a package manager.
- Third-party software compiled from source into the image.
- Your software installed into the image.
- Your software stored on the host but compiled in the container.

Note
Maybe you don’t have to install the software at all. Is there already a trustworthy image on Docker Hub you can use as a base?
12.2.1. Third-party software via package manager
This approach is the simplest and fastest way to install software in your image. The examples/hello Dockerfile does this to install the package openssh-clients:

RUN dnf install -y --setopt=install_weak_deps=false openssh-clients \
 && dnf clean all
COPY . hello
You can use distribution package managers such as dnf, as demonstrated above, or others, such as pip for Python packages. Be aware that the software will be downloaded anew each time you execute the instruction (unless you add an HTTP cache, which is out of scope of this documentation).
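For instance, a pip-based install might look like the following sketch. The package name is illustrative; pip’s --no-cache-dir flag keeps its download cache out of the image layer, which helps limit image size:

```dockerfile
# Hypothetical example: install a Python package with pip instead of dnf.
# --no-cache-dir avoids baking pip’s download cache into the image.
RUN pip3 install --no-cache-dir requests
```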
Note
RPM and friends (yum, dnf, etc.) have traditionally been rather troublesome in containers, and we suspect there are bugs we haven’t ironed out yet. If you encounter problems, please do file a bug!
12.2.2. Third-party software compiled from source
Under this method, one uses RUN commands to fetch the desired software using curl or wget, compile it, and install it. Our example does this with two chained Dockerfiles. First, we build a basic AlmaLinux image (examples/Dockerfile.almalinux_8ch):
FROM almalinux:8

# This image has three purposes: (1) demonstrate we can build an AlmaLinux 8
# image, (2) provide a build environment for Charliecloud EPEL 8 RPMs, and (3)
# provide image packages necessary for Obspy and Paraview.
#
# Quirks:
#
#   1. Install the dnf ovl plugin to work around RPMDB corruption when
#      building images with Docker and the OverlayFS storage driver.
#
#   2. Enable PowerTools repo, because some packages in EPEL depend on it.
#
#   3. Install packages needed to build el8 rpms.
#
#   4. Issue #1103: Install libarchive to resolve cmake bug.

RUN dnf install -y --setopt=install_weak_deps=false \
        epel-release \
        'dnf-command(config-manager)' \
 && dnf config-manager --enable powertools \
 && dnf install -y --setopt=install_weak_deps=false \
        dnf-plugin-ovl \
        autoconf \
        automake \
        gcc \
        git \
        libarchive \
        libpng-devel \
        make \
        python3 \
        python3-devel \
        python3-lark-parser \
        python3-requests \
        python3-sphinx \
        python3-sphinx_rtd_theme \
        rpm-build \
        rpmlint \
        rsync \
        squashfs-tools \
        squashfuse \
        wget \
        which \
 && dnf clean all

# Need wheel to install bundled Lark, and the RPM version doesn’t work.
RUN pip3 install wheel

# AlmaLinux's linker doesn’t search these paths by default; add them because we
# will install stuff later into /usr/local.
RUN echo "/usr/local/lib" > /etc/ld.so.conf.d/usrlocal.conf \
 && echo "/usr/local/lib64" >> /etc/ld.so.conf.d/usrlocal.conf \
 && ldconfig

# Install ImageMagick.
# The latest, 7.1.0, fails to install with a cryptic libtool error. ¯\_(ツ)_/¯
ARG MAGICK_VERSION=7.0.11-14
RUN wget -nv -O ImageMagick-${MAGICK_VERSION}.tar.gz \
        "https://github.com/ImageMagick/ImageMagick/archive/refs/tags/${MAGICK_VERSION}.tar.gz" \
 && tar xf ImageMagick-${MAGICK_VERSION}.tar.gz \
 && cd ImageMagick-${MAGICK_VERSION} \
 && ./configure --prefix=/usr/local \
 && make -j $(getconf _NPROCESSORS_ONLN) install \
 && rm -Rf ../ImageMagick-${MAGICK_VERSION}

# Add mount points for files and directories for paraview and obspy comparison
# tests.
RUN mkdir /diff \
 && echo "example bind mount file" > /a.png \
 && echo "example bind mount file" > /b.png
Then, in a second image (examples/Dockerfile.openmpi), we add OpenMPI. This is a complex Dockerfile that compiles several dependencies in addition to OpenMPI. For the purposes of this documentation, you can skip most of it, but we felt it would be useful to show a real example.
FROM libfabric
# See Dockerfile.libfabric for MPI goals and details.
# OpenMPI.
#
# Build with PMIx, PMI2, and FLUX-PMI support.
#
# 1. --disable-pty-support to avoid “pipe function call failed when
# setting up I/O forwarding subsystem”.
#
# 2. --enable-mca-no-build=plm-slurm to support launching processes using the
# host’s srun (i.e., the container OpenMPI needs to talk to the host Slurm’s
# PMIx) but prevent OpenMPI from invoking srun itself from within the
# container, where srun is not installed (the error messages from this are
# inscrutable).
ARG MPI_URL=https://www.open-mpi.org/software/ompi/v4.1/downloads
ARG MPI_VERSION=4.1.4
RUN wget -nv ${MPI_URL}/openmpi-${MPI_VERSION}.tar.gz \
 && tar xf openmpi-${MPI_VERSION}.tar.gz
RUN cd openmpi-${MPI_VERSION} \
 && CFLAGS=-O3 \
    CXXFLAGS=-O3 \
    FLUX_PMI_CFLAGS=-I/usr/local/include/flux/core,-L/usr/local/lib/flux \
    FLUX_PMI_LIBS=-lpmi \
    ./configure --prefix=/usr/local \
                --sysconfdir=/mnt/0 \
                --with-pmix=/usr/local \
                --with-pmi=/usr/local \
                --with-flux-pmi-library \
                --with-libfabric=/usr/local \
                --disable-pty-support \
                --enable-mca-no-build=btl-openib,plm-slurm \
 && make -j$(getconf _NPROCESSORS_ONLN) install \
 && rm -Rf ../openmpi-${MPI_VERSION}*
RUN ldconfig
# OpenMPI expects this program to exist, even if it’s not used. Default is
# “ssh : rsh”, but that’s not installed.
RUN echo 'plm_rsh_agent = false' >> /mnt/0/openmpi-mca-params.conf
# Silence spurious pmix error. https://github.com/open-mpi/ompi/issues/7516.
ENV PMIX_MCA_gds=hash
So what is going on here?

- Use the latest AlmaLinux 8 as the base image.
- Install a basic build system using the OS package manager.
- For a few dependencies and then OpenMPI itself:
  - Download and untar. Note the use of variables to make adjusting the URL and versions easier, as well as the explanation of why we’re not using dnf, given that several of these packages are included in CentOS.
  - Build and install OpenMPI. Note the getconf trick to guess at an appropriate parallel build.
  - Clean up, in order to reduce the size of the build cache as well as the resulting Charliecloud image (rm -Rf).
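The getconf trick is easy to try on its own. A minimal sketch (the variable name nprocs is ours, and the fallback to 1 is a hypothetical guard for platforms where the query returns nothing):

```shell
# Ask POSIX getconf how many processors are online, and use that as the
# parallelism level for make. Fall back to 1 if the query yields nothing.
nprocs=$(getconf _NPROCESSORS_ONLN)
[ -n "$nprocs" ] || nprocs=1
echo "make -j${nprocs}"
```

On a 16-core node this would suggest make -j16; the same image then builds with sensible parallelism wherever it is rebuilt, without hard-coding a core count.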
12.2.3. Your software stored in the image
This method covers software provided by you that is included in the image. This is recommended when your software is relatively stable or is not easily available to users of your image, for example a library rather than a simulation code under active development.
The general approach is the same as installing third-party software from source, but you use the COPY instruction to transfer files from the host filesystem (rather than the network via HTTP) to the image. For example, examples/mpihello/Dockerfile.openmpi uses this approach:
# ch-test-scope: full
FROM openmpi
# This example
COPY . /hello
WORKDIR /hello
RUN make clean && make
These Dockerfile instructions:

- Copy the host directory examples/mpihello to the image at path /hello. The host path is relative to the context directory, which is tarred up and sent to the Docker daemon. Docker builds have no access to the host filesystem outside the context directory. (Unlike HPC, Docker comes from a world without network filesystems. This tar-based approach lets the Docker daemon run on a different node from the client without needing any shared filesystems.) The usual convention, including for Charliecloud tests and examples, is that the context is the directory containing the Dockerfile in question. A common pattern, used here, is to copy in the entire context.
- cd to /hello.
- Compile our example. We include make clean to remove any leftover build files, since they would be inappropriate inside the container.
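If you prefer not to copy in the entire context, COPY can also name specific files. A hypothetical narrower variant of the example above, copying only the build inputs:

```dockerfile
FROM openmpi

# Copy only the named build inputs from the context directory, rather
# than the whole context; the trailing slash makes /hello/ a directory.
COPY Makefile hello.c /hello/
WORKDIR /hello
RUN make clean && make
```

This keeps incidental files (editor backups, test outputs) out of the image, at the cost of having to update the list when the build inputs change.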
Once the image is built, we can see the results. (Install the image into /var/tmp as outlined in the tutorial, if you haven’t already.)
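One possible shape of that step, assuming ch-convert is available and the image was built under the name mpihello-openmpi (both the name and destination path are illustrative):

```shell
$ ch-convert mpihello-openmpi /var/tmp/mpihello-openmpi.sqfs
```

See the tutorial for the authoritative procedure; ch-convert infers the input and output formats from its arguments.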
$ ch-run /var/tmp/mpihello-openmpi.sqfs -- ls -lh /hello
total 32K
-rw-rw---- 1 charlie charlie 908 Oct 4 15:52 Dockerfile
-rw-rw---- 1 charlie charlie 157 Aug 5 22:37 Makefile
-rw-rw---- 1 charlie charlie 1.2K Aug 5 22:37 README
-rwxr-x--- 1 charlie charlie 9.5K Oct 4 15:58 hello
-rw-rw---- 1 charlie charlie 1.4K Aug 5 22:37 hello.c
-rwxrwx--- 1 charlie charlie 441 Aug 5 22:37 test.sh
12.2.4. Your software stored on the host
This method leaves your software on the host but compiles it inside the container. This is recommended when your software is volatile or each image user needs a different version, for example a simulation code under active development.
The general approach is to bind-mount the appropriate directory and then run the build inside the container. We can re-use the mpihello image to demonstrate this.
$ cd examples/mpihello
$ ls -l
total 20
-rw-rw---- 1 charlie charlie 908 Oct 4 09:52 Dockerfile
-rw-rw---- 1 charlie charlie 1431 Aug 5 16:37 hello.c
-rw-rw---- 1 charlie charlie 157 Aug 5 16:37 Makefile
-rw-rw---- 1 charlie charlie 1172 Aug 5 16:37 README
$ ch-run -b .:/mnt/0 --cd /mnt/0 /var/tmp/mpihello.sqfs -- make
mpicc -std=gnu11 -Wall hello.c -o hello
$ ls -l
total 32
-rw-rw---- 1 charlie charlie 908 Oct 4 09:52 Dockerfile
-rwxrwx--- 1 charlie charlie 9632 Oct 4 10:43 hello
-rw-rw---- 1 charlie charlie 1431 Aug 5 16:37 hello.c
-rw-rw---- 1 charlie charlie 157 Aug 5 16:37 Makefile
-rw-rw---- 1 charlie charlie 1172 Aug 5 16:37 README
A common use case is to leave a container shell open in one terminal for building, and then run using a separate container invoked from a different terminal.
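That workflow might look like the following transcript sketch, re-using the bind mount from the example above (prompts and the choice of /bin/sh are illustrative):

```
# Terminal 1: an interactive container shell, kept open for rebuilds.
$ ch-run -b .:/mnt/0 --cd /mnt/0 /var/tmp/mpihello.sqfs -- /bin/sh
sh$ make
# Terminal 2: a separate container invocation to run the result.
$ ch-run -b .:/mnt/0 --cd /mnt/0 /var/tmp/mpihello.sqfs -- /mnt/0/hello
```

Because the source and build products live on the host, edits made in either terminal are immediately visible in both containers.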