Singularity Advanced Topics
Building containers
Building writable containers
By default, the containers you build in Singularity are read-only. Any changes you make are not saved. This is usually not a problem if the container you use has everything you need, since you can save files in your working directory or home directory and they will persist outside of the container.
However, your container may be missing something that you would like to persist between runs (like another piece of software) but that doesn't make sense to keep in your home directory. In that case, you can build the container in a writable directory, called a sandbox.
Tip
An alternative way to have a writable (portion of a) filesystem in Singularity is an overlay. Overlays are files that act like a storage drive you can "plug in" to your container rather than encompassing the entire root filesystem. Since an overlay is viewed as a single file, overlays are great for "tricking" the HPCC into allowing you to use more files than your quota allows. For more information, including powertools to help you get started and examples of installing conda, see the Lab Notebook on Singularity Overlays.
You can create a sandbox using the `singularity build` command with the `--sandbox` option. As arguments, use a directory name for the location of the sandbox and an image you want to start with (which can be either a URI or a file):
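For example (the directory name `alpine/` and the `docker://alpine` URI are illustrative choices, matching the sandbox used in the rest of this section):

```bash
singularity build --sandbox alpine/ docker://alpine
```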
If you look inside the directory, you will see what looks like the full file system for the container:
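For instance (the exact listing depends on the image; Singularity also adds a few files of its own, like `environment` and `singularity`):

```
$ ls alpine/
bin  dev  environment  etc  home  lib  media  mnt  opt  proc
root  run  sbin  singularity  srv  sys  tmp  usr  var
```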
To run this image with any Singularity command, pass the directory as the image name:
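For example, to open a shell in the sandbox:

```bash
singularity shell alpine/
```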
However, in order to make changes that will persist, you need to use the `--writable` option. Let's try to install some software:
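A session along these lines would work, assuming the Alpine sandbox from above (output abbreviated; Python is the example package):

```
$ singularity shell --writable alpine/
Singularity> apk update
Singularity> apk add python3
Singularity> python3 --version
Python 3.x.x
Singularity> exit
```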
If we use the `alpine/` sandbox again, we'll still have access to Python! The sandbox can be packaged back up into an image file by again using the `singularity build` command:
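For example (the `.sif` name is an illustrative choice; build output abbreviated):

```
$ singularity build alpine.sif alpine/
INFO:    Starting build...
INFO:    Creating SIF file...
INFO:    Build complete: alpine.sif
```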
However, this method for building new containers does not leave a record of how you changed the base image. For better reproducibility, you should use a Singularity definition file as described in the next section.
Building containers from scratch with Singularity definition files
Warning
Building Singularity images from definition files requires super user permissions. You will need to install Singularity on your local computer to run these steps.
Alternatively, you might prefer building a Docker container and using it in Singularity as discussed below.
To build containers, Singularity uses a Singularity definition file, which is similar to a Dockerfile in Docker. We will walk through building a Singularity definition file that creates an image comparable to the one in our Docker tutorial.
We will set up our working directory the same way:
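For example, mirroring the Docker tutorial (the directory name is illustrative):

```bash
mkdir my_first_container
cd my_first_container
```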
and create the Python script `hello.py`:
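A minimal version of the script (the exact contents from the Docker tutorial may differ):

```python
# hello.py: printed by the container's runscript
print("Hello, world!")
```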
Now, create the file `Singularity` with the content below:
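A definition file along the following lines fits the sections described below (the package names and paths are representative):

```
Bootstrap: docker
From: alpine

%files
    hello.py /usr/src/my_app/hello.py

%post
    apk update
    apk add py3-pip

%runscript
    python3 /usr/src/my_app/hello.py
```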
Now, let's learn the meaning of each section.
The first section means that we will use Alpine Linux as a base image. In fact, the `Bootstrap` line tells Singularity that we are using the Docker `alpine` image hosted on Docker Hub. Other options for the `Bootstrap` line include `library` for images in Sylabs' container library and `shub` for images on the (archived) Singularity Hub.
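That is, the first two lines of the definition file:

```
Bootstrap: docker
From: alpine
```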
The `%files` section tells Singularity which files in our local directory we want to copy into the container and where to put them. In this case, we are copying our Python script into `/usr/src/my_app/` in the container.
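From the definition file above:

```
%files
    hello.py /usr/src/my_app/hello.py
```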
The `%post` section is used to install `pip` using the Alpine Package Keeper (`apk`).
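From the definition file above (the exact `apk` invocation may vary):

```
%post
    apk update
    apk add py3-pip
```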
Finally, the `%runscript` section tells the container what command to run when the container is invoked through `singularity run`.
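From the definition file above:

```
%runscript
    python3 /usr/src/my_app/hello.py
```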
We can now build the image using the `singularity build` command. Don't forget that you'll need super user permission!
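With the definition file named `Singularity` as above:

```bash
sudo singularity build my_first_image.sif Singularity
```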
This will create the `my_first_image.sif` file that you can now run.
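For example (the output assumes the illustrative `hello.py` above):

```
$ singularity run my_first_image.sif
Hello, world!
```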
You can now use this Singularity image file anywhere you like, including the HPCC.
Migrating from Docker to Singularity
For more information regarding Docker support in Singularity, please see the official documentation.
Direct comparison
| Topic | Docker | Singularity |
|---|---|---|
| Installation | Local computer only | HPCC and local computer |
| Privileges | Requires super user privileges | Only requires super user privileges for building images |
| Compatibility | Docker images | Docker and Singularity images |
| Images | Cached and managed by Docker | Available as `.sif` files (can also be cached and managed by Singularity) |
| File sharing | Manually specified bind mounts (e.g., the `-v` option) | Automatically binds useful directories (`$HOME`, `$PWD`, etc.); others can be specified via the `--bind` option and through overlay files |
| Build file | Dockerfile | Singularity definition file |
| Downloading images | `docker pull <container>` | `singularity pull <uri-prefix>://<container>` |
| Running | `docker run <container>` | `singularity run <container>.sif` |
| Running a command | `docker run <container> <command>` | `singularity exec <container>.sif <command>` |
| Interactive shell | `docker run -it <container> sh` | `singularity shell <container>.sif` |
Converting from Docker images to Singularity images
There are a few ways to use Docker images with Singularity. If the image is publicly available on Docker Hub, it is as easy as using the `singularity pull` command with a Docker URI. See the example in the Singularity introduction. If you are installing from a private repository on Docker Hub, use the `--docker-login` flag with `singularity pull` to authenticate with Docker.
If the Docker image is only available locally (e.g., you are testing local builds and don't want to push to a repository), you have two options. First, you can build a Singularity image directly from a cached Docker image:
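A sketch, assuming a local image tagged `my_image:latest` (the `docker-daemon` bootstrap agent reads images from the local Docker daemon):

```bash
sudo singularity build my_image.sif docker-daemon://my_image:latest
```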
Note that this requires Singularity and Docker to be installed on the same system, and requires super user permissions.
The second option is to first archive the Docker image into a `tar` file, then use this to build the Singularity image:
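A sketch with illustrative image names:

```bash
docker save -o docker_image.tar my_image:latest
singularity build my_image.sif docker-archive://docker_image.tar
```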
Here you could perform the `docker save` step on your local machine, move the `docker_image.tar` file to the HPCC, and then run the `singularity build` step on the HPCC, since it does not require super user privileges.
A note on permissions
Singularity automatically mounts many system directories in your container, including `$HOME` and `$PWD`. When you enter a shell in a Singularity container, you will be in the same directory you started from. You are also logged in as the same user inside the Singularity container as you are on the host when you start the container.
In contrast, a Docker shell usually starts in `/` as the `root` user (or some other user). Thus, you may have different permissions in a Docker container that is run in Singularity. This can cause problems if a Docker container expects you to be able to write to directories that your HPCC user will not have access to (like `/root`).
In these cases, you may have to modify the Dockerfile used to create the Docker image so that anything you need to access is stored in a location accessible to your user.
Using Singularity with MPI and GPUs
If you are running a container that uses MPI, you must use `srun -n $SLURM_NTASKS -c $SLURM_CPUS_PER_TASK` before the `singularity` command to make the command aware of all resources allotted. See a template script below.
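A `singularity_mpi.sbatch` template along these lines; the module name, resource requests, and container/program names are placeholders to adapt to your job:

```bash
#!/bin/bash
#SBATCH --job-name=singularity_mpi
#SBATCH --nodes=2
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=1G
#SBATCH --time=00:10:00

# Load an MPI implementation compatible with the one inside the container
# (placeholder module name)
module purge
module load OpenMPI

# Placeholder container and program names
CONTAINER=my_mpi_container.sif
PROGRAM=my_mpi_program

# srun makes the full allocation visible to the singularity command,
# launching one container instance per task
srun -n $SLURM_NTASKS -c $SLURM_CPUS_PER_TASK singularity exec $CONTAINER $PROGRAM
```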
To run a container that takes advantage of GPU resources, you can use the `--nv` flag with any `run`, `exec`, or `shell` Singularity command. Otherwise, use the standard `sbatch` setup for running any GPU job. An example script that pulls a TensorFlow container and displays the available GPUs is shown below.
|
Cached images
When you use a Docker image without pulling it first, it appears that no Singularity image file was created:
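For example (output trimmed):

```
$ singularity exec docker://alpine echo hello
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
...
hello
$ ls *.sif
ls: cannot access '*.sif': No such file or directory
```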
In fact, Singularity stores these files and the files used to create them in a cache:
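For example (the sizes shown are illustrative):

```
$ singularity cache list
There are 4 container file(s) using 1.52 GiB and 36 oci blob file(s) using 1.83 GiB of space
Total space used: 3.35 GiB
```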
As you can see, the files stored here can build up quickly. You can clean this cache using:
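```bash
singularity cache clean
```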
Everything in the cache can be safely removed, and will just be redownloaded if needed again.
By default, Singularity uses `~/.singularity/cache` to store these files. If you want to use another directory (e.g., your scratch space), you can set the `SINGULARITY_CACHEDIR` environment variable. Singularity also uses a temporary directory (`/tmp` by default) that you might want to change using the `SINGULARITY_TMPDIR` environment variable. For example:
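For example (the paths are illustrative; `$SCRATCH` is assumed to point to your scratch space):

```bash
export SINGULARITY_CACHEDIR=$SCRATCH/.singularity/cache
export SINGULARITY_TMPDIR=$SCRATCH/singularity_tmp
singularity --debug pull docker://alpine
```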
Using the `--debug` flag shows a lot of information, but toward the end it reports the cache and temporary directory locations in use, verifying that the scratch directories were picked up.