Inject files from the host into an image directory, with various magic.
$ ch-fromhost [OPTION ...] [FILE_OPTION ...] IMGDIR
This command is experimental. Features may be incomplete and/or buggy. Please report any issues you find, so we can fix them!
Inject files from the host into the Charliecloud image directory IMGDIR.
The purpose of this command is to inject the host files a container needs in order to access host-specific resources, usually GPUs or proprietary interconnects. It is not a general copy-to-image tool; see further discussion on use cases below.
It should be run after ch-convert and before ch-run. After
invocation, the image is no longer portable to other hosts.
Injection is not atomic; if an error occurs partway through injection, the image is left in an undefined state and should be re-unpacked from storage. Injection is currently implemented using a simple file copy, but that may change in the future.
Arbitrary file and libfabric injection are handled differently.
5.2.1. Arbitrary files
Arbitrary file paths that contain the strings /bin or /sbin are assumed to be executables and placed in /usr/bin within the container. Paths that are not loadable libfabric providers and contain the strings /lib or .so are assumed to be shared libraries and are placed in the first-priority directory reported by ldconfig (see --lib-path below). Other files are placed in the directory specified by --dest.
If any shared libraries are injected, run ldconfig inside the container (using ch-run -w) after injection.
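The destination inference described above can be sketched as a small shell function. This is only an illustration: the function name classify_path and the match patterns are a reading of the heuristics in this section and are not taken from the ch-fromhost source. The provider check relies on the "-fi.so" suffix convention for loadable libfabric providers.

```shell
# Hypothetical helper (not part of ch-fromhost): guess how a host path
# would be treated. Prints one of: provider, /usr/bin, libdir, other.
classify_path () {
    case "$1" in
        *-fi.so)         echo provider ;;  # loadable libfabric provider
        */bin*|*/sbin*)  echo /usr/bin ;;  # assumed to be an executable
        */lib*|*.so*)    echo libdir ;;    # assumed to be a shared library
        *)               echo other ;;     # would need an explicit --dest
    esac
}

for p in /bin/bar /usr/lib64/libfoo.so /home/ofi/libgnix-fi.so /etc/quux; do
    printf '%s -> %s\n' "$p" "$(classify_path "$p")"
done
```

Here "libdir" stands for whatever directory ldconfig reports first inside the image, which this sketch does not try to compute.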
MPI implementations have numerous ways of communicating messages over interconnects. We use libfabric (OFI), an OpenFabrics framework that exports fabric communication services to applications, to manage these communications with built-in or loadable fabric providers.
Using OFI, we can (a) uniformly manage fabric communication services for both OpenMPI and MPICH, and (b) use simplified methods of accessing proprietary host hardware, e.g., Cray’s Gemini/Aries and Slingshot (CXI).
OFI providers implement the application-facing software interfaces needed to access network-specific protocols, drivers, and hardware. Loadable providers, i.e., compiled OFI libraries that end in -fi.so, for example, Cray’s libgnix-fi.so, can be copied into, and used by, an image with an MPI configured against OFI. Alternatively, the image’s libfabric.so can be overwritten with the host’s. See details and quirks below.
5.3.1. To specify which files to inject
--cmd CMD
Inject files listed in the standard output of command CMD.
--file FILE
Inject files listed in the file FILE.
--path PATH
Inject the file at PATH.
Inject cray-libfabric for Slingshot. This is equivalent to
--path $CH_FROMHOST_OFI_CXI, where
$CH_FROMHOST_OFI_CXI is the path to the Cray host libfabric, libfabric.so.
Inject the Cray Gemini/Aries GNI libfabric provider
libgnix-fi.so. This is equivalent to
--fi-provider $CH_FROMHOST_OFI_GNI, where
$CH_FROMHOST_OFI_GNI is the path to the Cray host ugni provider.
--nvidia
Use libnvidia-container to find executables and libraries to inject.
These can be repeated, and at least one must be specified.
5.3.2. To specify the destination within the image
-d, --dest DST
Place files specified later in directory IMGDIR/DST, overriding the inferred destination, if any. If a file’s destination cannot be inferred and --dest has not been specified, exit with an error. This can be repeated to place files in varying destinations.
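Because --dest applies only to files specified after it, argument order matters. The following is a minimal sketch of that left-to-right behavior, assuming only --path and --dest are given; plan_injection is a hypothetical name, and the real option parsing in ch-fromhost may differ.

```shell
# Hypothetical sketch: walk the arguments left to right, remember the most
# recent --dest, and apply it to every --path that follows it.
plan_injection () {
    dest=''
    while [ $# -gt 0 ]; do
        case "$1" in
            --dest) dest=$2; shift 2 ;;
            --path) printf '%s -> %s\n' "$2" "${dest:-(inferred)}"; shift 2 ;;
            *)      shift ;;  # IMGDIR and anything else: ignored in this sketch
        esac
    done
}

plan_injection --path /bin/bar --dest /etc --path /etc/quux /var/tmp/baz
```

In this sketch, /bin/bar is listed before any --dest and so keeps its inferred destination, while /etc/quux is routed to /etc.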
5.3.3. Additional arguments
Print the guest destination path for libfabric providers and replacement.
Print the guest destination path for shared libraries inferred as described above.
Do not run ldconfig even if we appear to have injected shared libraries.
Print help and exit.
List the injected files.
Print version and exit.
5.4. When to use
This command does a lot of heuristic magic; while it can copy arbitrary files into an image, this usage is discouraged and prone to error. Here are some use cases and the recommended approach:
I have some files on my build host that I want to include in the image. Use the COPY instruction within your Dockerfile. Note that it’s OK to build an image that meets your specific needs but isn’t generally portable, e.g., only runs on specific micro-architectures you’re using.
I have an already built image and want to install a program I compiled separately into the image. Consider whether building a new derived image with a Dockerfile is appropriate. Another good option is to bind-mount the directory containing your program at run time. A less good option is to cp(1) the program into your image, because this permanently alters the image in a non-reproducible way.
I have some shared libraries that I need in the image for functionality or performance, and they aren’t available in a place where I can use COPY. This is the intended use case of ch-fromhost. You can use --path to put together a custom solution. But please consider filing an issue so we can package your functionality with a tidy option like --nvidia.
5.5. Libfabric usage and quirks
The implementation of libfabric provider injection and replacement is experimental and has a couple of quirks.
Containers must have the following software installed:
libfabric (https://ofiwg.github.io/libfabric/). See
A corresponding open source MPI implementation configured and built against the container libfabric, e.g., MPICH or OpenMPI. See
At run time, a libfabric provider can be specified with the variable
FI_PROVIDER. The path to search for shared providers can be specified with
FI_PROVIDER_PATH. These variables can be inherited from the host or explicitly set with the container’s environment file
To avoid issues and reduce complexity, the inferred injection destination for libfabric providers and replacement will always be the path in the image where libfabric.so is found.
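As a rough illustration of the two run-time variables, they might be exported on the host before starting the container. The values here are assumptions: "gni" and the provider directory /usr/local/lib/libfabric are borrowed from the GNI example output in this document and will differ from site to site.

```shell
# Assumed values for illustration only; adjust to your provider and to the
# directory inside the image where the provider was injected.
export FI_PROVIDER=gni
export FI_PROVIDER_PATH=/usr/local/lib/libfabric

printf 'FI_PROVIDER=%s\nFI_PROVIDER_PATH=%s\n' "$FI_PROVIDER" "$FI_PROVIDER_PATH"
```

Whether exported variables reach the containerized process depends on how the container environment is configured, so check with fi_info inside the image.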
The Cray GNI loadable provider, libgnix-fi.so, will link to compiler(s) in the programming environment by default. For example, if it is built under the PrgEnv-intel programming environment, it will have links to files at paths that ch-run will not bind automatically.
Managing all possible bind mount paths is untenable. Thus, this experimental implementation injects libraries linked to a libgnix-fi.so built with the minimal modules necessary to compile, i.e.:
A Cray GNI provider linked against more complicated PEs will still work, assuming 1) the user explicitly bind-mounts missing libraries listed in its ldd output, and 2) all such libraries do not conflict with container functionality, e.g.,
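To find which libraries would need an explicit bind mount, one option is to filter ldd output for unresolved entries. The helper below is a hypothetical sketch (missing_libs is not part of Charliecloud); it reads ldd-style text on standard input and prints the names marked "not found". The sample input is made up for illustration.

```shell
# Hypothetical helper: print the libraries that ldd could not resolve.
missing_libs () {
    awk '/=> not found/ { print $1 }'
}

# Made-up sample of ldd output; real output comes from running ldd on the
# provider inside the image.
missing_libs <<'EOF'
    libfabric.so.1 => /usr/local/lib/libfabric.so.1 (0x00007f0000000000)
    libexample.so.5 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)
EOF
```

In practice the input would come from something like running ldd on the injected provider via ch-run and piping the result into missing_libs, assuming ldd is available in the image.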
At the time of this writing, a Cray Slingshot optimized provider is not available; however, recent libfabric source activity indicates there may be one at some point; see https://github.com/ofiwg/libfabric/pull/7839.
For now, on Cray systems with Slingshot (CXI), we need to overwrite the container’s libfabric.so with the host’s using --path. See examples for details.
Tested only for C programs compiled with GCC. Additional bind mount or kludging may be needed for untested use cases. If you’d like to use another compiler or programming environment, please get in touch so we can implement the necessary support.
Please file a bug if we missed anything above or if you know how to make the code better.
Symbolic links are dereferenced, i.e., the files pointed to are injected, not the links themselves.
As a corollary, do not include symlinks to shared libraries. These will be
There are two alternate approaches for nVidia GPU libraries:
Link libnvidia-container into ch-run and call the library functions directly. However, this would mean that Charliecloud would either (a) need to be compiled differently on machines with and without nVidia GPUs or (b) have libnvidia-container available even on machines without nVidia GPUs. Neither of these is consistent with Charliecloud’s philosophies of simplicity and minimal dependencies.
Use nvidia-container-cli configure to do the injecting. This would require that containers have a half-started state, where the namespaces are active and everything is mounted but pivot_root(2) has not been performed. This is not feasible because Charliecloud has no notion of a half-started container.
Further, while these alternate approaches would simplify or eliminate this script for nVidia GPUs, they would not solve the problem for other situations.
File paths may not contain colons or newlines.
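A file list can be checked against this restriction before it is passed to ch-fromhost. The function below is a hypothetical pre-check, not something ch-fromhost provides; it succeeds only for paths with no colon and no newline.

```shell
# Hypothetical pre-check: return 0 if the path contains neither a colon
# nor a newline, 1 otherwise.
path_ok () {
    nl='
'
    case "$1" in
        *:*|*"$nl"*) return 1 ;;
    esac
    return 0
}

for p in /usr/lib64/libfoo.so /opt/a:b; do
    if path_ok "$p"; then
        echo "ok: $p"
    else
        echo "rejected: $p"
    fi
done
```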
ldconfig tends to print
stat errors; these are typically
non-fatal and occur when trying to probe common library paths. See issue #732.
Cray Slingshot CXI injection.
Replace image libfabric, i.e., libfabric.so, with Cray host’s libfabric at host path /opt/cray-libfabric/lib64/libfabric.so.
$ ch-fromhost -v --path /opt/cray-libfabric/lib64/libfabric.so /tmp/ompi
[ debug ] queueing files
[ debug ] cray libfabric: /opt/cray-libfabric/lib64/libfabric.so
[ debug ] searching image for inferred libfabric destiation
[ debug ] found /tmp/ompi/usr/local/lib/libfabric.so
[ debug ] adding cray libfabric libraries
[ debug ] skipping /lib64/libcom_err.so.2
[...]
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libcxi.so.1
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libcxi.so.1.2.1
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libjson-c.so.3
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libjson-c.so.3.0.1
[...]
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libssh.so.4
[ debug ] queueing files
[ debug ] shared library: /usr/lib64/libssh.so.4.7.4
[...]
[ debug ] inferred shared library destination: /tmp/ompi//usr/local/lib
[ debug ] injecting into image: /tmp/ompi/
[ debug ] mkdir -p /tmp/ompi//var/lib/hugetlbfs
[ debug ] mkdir -p /tmp/ompi//var/spool/slurmd
[ debug ] echo '/usr/lib64' >> /tmp/ompi//etc/ld.so.conf.d/ch-ofi.conf
[ debug ] /opt/cray-libfabric/lib64/libfabric.so -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libcxi.so.1 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libcxi.so.1.2.1 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libjson-c.so.3 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libjson-c.so.3.0.1 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libssh.so.4 -> /usr/local/lib (inferred)
[ debug ] /usr/lib64/libssh.so.4.7.4 -> /usr/local/lib (inferred)
[ debug ] running ldconfig
[ debug ] ch-run -w /tmp/ompi/ -- /sbin/ldconfig
[ debug ] validating ldconfig cache
done
Same as above, except also inject Cray’s
fi_info to verify Slingshot
$ ch-fromhost -v --path /opt/cray/libfabric/220.127.116.11/lib64/libfabric.so \
                 -d /usr/local/bin \
                 --path /opt/cray/libfabric/18.104.22.168/lib64/libfabric.so \
                 /tmp/ompi
[...]
$ ch-run /tmp/ompi/ -- fi_info -p cxi
provider: cxi
fabric: cxi
[...]
type: FI_EP_RDM
protocol: FI_PROTO_CXI
Cray GNI shared provider injection.
Add Cray host built GNI provider
libgnix-fi.so to the image and verify
$ ch-fromhost -v --path /home/ofi/libgnix-fi.so /tmp/ompi
[ debug ] queueing files
[ debug ] libfabric shared provider: /home/ofi/libgnix-fi.so
[ debug ] searching /tmp/ompi for libfabric shared provider destination
[ debug ] found: /tmp/ompi/usr/local/lib/libfabric.so
[ debug ] inferred provider destination: //usr/local/lib/libfabric
[ debug ] injecting into image: /tmp/ompi
[ debug ] mkdir -p /tmp/ompi//usr/local/lib/libfabric
[ debug ] mkdir -p /tmp/ompi/var/lib/hugetlbfs
[ debug ] mkdir -p /tmp/ompi/var/opt/cray/alps/spool
[ debug ] mkdir -p /tmp/ompi/opt/cray/wlm_detect
[ debug ] mkdir -p /tmp/ompi/etc/opt/cray/wlm_detect
[ debug ] mkdir -p /tmp/ompi/opt/cray/udreg
[ debug ] mkdir -p /tmp/ompi/opt/cray/xpmem
[ debug ] mkdir -p /tmp/ompi/opt/cray/ugni
[ debug ] mkdir -p /tmp/ompi/opt/cray/alps
[ debug ] echo '/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/alps/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/udreg/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/ugni/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/wlm_detect/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/opt/cray/xpmem/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] echo '/usr/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
[ debug ] /home/ofi/libgnix-fi.so -> //usr/local/lib/libfabric (inferred)
[ debug ] running ldconfig
[ debug ] ch-run -w /tmp/ompi -- /sbin/ldconfig
[ debug ] validating ldconfig cache
done
$ ch-run /tmp/ompi -- fi_info -p gni
provider: gni
fabric: gni
[...]
type: FI_EP_RDM
protocol: FI_PROTO_GNI
Place shared library
/usr/lib64/libfoo.so at path /usr/lib/libfoo.so (assuming
/usr/lib is the first directory
searched by the dynamic loader in the image), within the image
/var/tmp/baz and executable
/bin/bar at path
/usr/bin/bar. Then, create appropriate symlinks to
$ cat qux.txt
/bin/bar
/usr/lib64/libfoo.so
$ ch-fromhost --file qux.txt /var/tmp/baz
Same as above:
$ ch-fromhost --cmd 'cat qux.txt' /var/tmp/baz
Same as above:
$ ch-fromhost --path /bin/bar --path /usr/lib64/libfoo.so /var/tmp/baz
Same as above, but place the files into
/corge instead (and the shared
library will not be found by
$ ch-fromhost --dest /corge --file qux.txt /var/tmp/baz
Same as above, and also place file /etc/quux at /etc/quux
within the container:
$ ch-fromhost --file qux.txt --dest /etc --path /etc/quux /var/tmp/baz
Inject the executables and libraries recommended by nVidia into the image, and
$ ch-fromhost --nvidia /var/tmp/baz
asking ldconfig for shared library destination
/sbin/ldconfig: Can't stat /libx32: No such file or directory
/sbin/ldconfig: Can't stat /usr/libx32: No such file or directory
shared library destination: /usr/lib64//bind9-export
injecting into image: /var/tmp/baz
/usr/bin/nvidia-smi -> /usr/bin (inferred)
/usr/bin/nvidia-debugdump -> /usr/bin (inferred)
/usr/bin/nvidia-persistenced -> /usr/bin (inferred)
/usr/bin/nvidia-cuda-mps-control -> /usr/bin (inferred)
/usr/bin/nvidia-cuda-mps-server -> /usr/bin (inferred)
/usr/lib64/libnvidia-ml.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
/usr/lib64/libnvidia-cfg.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
[...]
/usr/lib64/libGLESv2_nvidia.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
/usr/lib64/libGLESv1_CM_nvidia.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
running ldconfig
This command was inspired by the similar Shifter feature
that allows Shifter containers to use the Cray Aries network. We particularly
appreciate the help provided by Shane Canon and Doug Jacobsen during our
We appreciate the advice of Ryan Olson at nVidia on implementing