
EasyBuild Tutorial

One of the most complex parts of using an HPCC is often installing the software you want to use. EasyBuild is a piece of software that helps simplify the process. It collects well-tested recipes contributed by people installing this software (often on HPCCs just like MSU's) all over the world. It's also what ICER uses to install all of the modules you use on the HPCC!

A warning: while "easy" is in the name, don't ever expect installing software on the HPCC to be easy... But EasyBuild is probably the closest you'll get.

What kind of software do you need?

EasyBuild can help you install all kinds of software, but there are other options that we recommend for things like Python and R. For Python, we recommend Anaconda, and for R we recommend using the built-in install.packages command. EasyBuild will be most helpful if you need to compile a piece of software from scratch that somebody else has created a recipe for.

In this tutorial, we are going to try to install a piece of software that's not already on the HPCC called zfp.

Loading EasyBuild

To get started, we load the EasyBuild module:

module purge
module load EasyBuild

We now have access to the eb command, which does everything you need in EasyBuild, as well as the alias ebS, which takes the place of the eb command (with some nice defaults set up).
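To confirm that loading the module worked before going further, a quick check along these lines can help. This is only a sketch: have_cmd is a helper name made up for this example, and because ebS is an alias, it will generally only be reported in an interactive shell where the alias has been defined.

```shell
# Report whether a command, function, or alias is visible in the current shell.
# have_cmd is a hypothetical helper, not part of EasyBuild or the HPCC setup.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

for tool in module eb ebS; do
    if have_cmd "$tool"; then
        echo "$tool: available"
    else
        echo "$tool: not found (has the EasyBuild module been loaded?)"
    fi
done
```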

Configuring EasyBuild

We can first check our global EasyBuild configuration using

input
ebS --show-config
output
#
#
# Current EasyBuild configuration
# (C: command line argument, D: default value, E: environment variable, F: configuration file)
#
buildpath                 (E) = /tmp/k0068027/EASYBUILD
containerpath             (D) = /mnt/home/k0068027/.local/easybuild/containers
cuda-compute-capabilities (C) = 3.5, 3.7, 7.0
detect-loaded-modules     (C) = purge
group-writable-installdir (C) = True
installpath               (E) = /mnt/home/k0068027
installpath-modules       (E) = /mnt/home/k0068027/modules
installpath-software      (E) = /mnt/home/k0068027/software
job-backend               (C) = Slurm
module-depends-on         (C) = True
optarch                   (E) = GENERIC
rebuild                   (C) = True
repositorypath            (D) = /mnt/home/k0068027/.local/easybuild/ebfiles_repo
robot                     (C) = /mnt/research/helpdesk/EB_Files_4, /opt/software/EasyBuild/4.7.1/easybuild/easyconfigs, /mnt/research/helpdesk/ebfiles
robot-paths               (E) = /mnt/research/helpdesk/EB_Files_4, /opt/software/EasyBuild/4.7.1/easybuild/easyconfigs, /mnt/research/helpdesk/ebfiles
sourcepath                (D) = /mnt/home/k0068027/.local/easybuild/sources
suffix-modules-path       (C) = ''

Whoa! There are a lot of options here! Some of them have been set up by HPCC staff to make your life easier (compare with the output of eb --show-config), and you shouldn't worry about most of them. However, let's highlight the important ones:

installpath

This is the root directory for your software installation. In this case, it's my (k0068027) home directory. Usually the software and modules both fall under this directory, but they can be set separately (see the next two options).

installpath-software

This is where all of your software is actually installed. When your installation is finished, you should be able to find it under its name in this directory.

installpath-modules

This is where the module files are stored for your software installations. What's a module file? It's how module load mysoftware works! So after you install something with EasyBuild, you'll have built your own personal module that you can load like anything else on the HPCC!

buildpath

This is where the software is compiled. Usually, keeping it as a /tmp directory is good since it's fast storage on the node for lots of small reads and writes. Once the software is built, it gets moved to your installpath-software directory anyway, so it really is temporary.

These are usually good defaults, but you might want to change them. For example, what if you need to install a piece of software in a research space so everyone in your group can access it? Or what if your home directory is filling up, you only need the software temporarily, and are okay installing it into your scratch space?

For the sake of this tutorial, we'll practice by installing the software into our scratch space, but leaving the module files in our home directory.

To change the configuration, we can pass the new value as a command line argument to any ebS command:

input
ebS --installpath-software=$SCRATCH/software --show-config
output
#
#
# Current EasyBuild configuration
# (C: command line argument, D: default value, E: environment variable, F: configuration file)
#
buildpath                 (E) = /tmp/k0068027/EASYBUILD
...
installpath               (E) = /mnt/home/k0068027
installpath-modules       (E) = /mnt/home/k0068027/modules
installpath-software      (C) = /mnt/gs21/scratch/k0068027/software
...

Great! It also tells us that this option was set using a command line argument by the (C). We'll have to make sure to include this option when we actually try to install the software: it's not "set and forget"!

As you can see from the output, there are multiple ways to configure EasyBuild. If you want to make more lasting changes that are "set and forget", try setting up a configuration file.
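The (E) entries in the output come from environment variables. EasyBuild reads options from variables named EASYBUILD_ followed by the option name in upper case, with dashes turned into underscores. The sketch below just performs that renaming; eb_env_name is a helper name invented for this example, and the commented-out export is an assumption about how you might use it, since the tutorial doesn't say how the ebS alias or the EasyBuild module interact with values you set yourself.

```shell
# Convert an EasyBuild option name to its environment variable name,
# e.g. installpath-software -> EASYBUILD_INSTALLPATH_SOFTWARE.
# eb_env_name is a hypothetical helper, not an EasyBuild command.
eb_env_name() {
    printf 'EASYBUILD_%s\n' "$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')"
}

eb_env_name installpath-software
eb_env_name buildpath

# To apply a value for the rest of the shell session, one could then run:
# export EASYBUILD_INSTALLPATH_SOFTWARE="$SCRATCH/software"
```

After exporting a variable like this, running ebS --show-config again is the way to confirm which source, (C) or (E), actually won.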

Finding our EasyConfig

So now that we're happy with and (mostly) understand our ebS --show-config results, we can try finding the recipe for the software we'd like to install. These recipes are called EasyConfigs and there's a good chance that someone has already created one for the software you're trying to install.

The list of EasyConfigs is stored on the HPCC and we can search through it using the -S option of ebS:

input
ebS -S zfp
output
== found valid index for /opt/software/EasyBuild/4.7.1/easybuild/easyconfigs, so using it...
CFGS1=/opt/software/EasyBuild/4.7.1/easybuild/easyconfigs/z/zfp
 * $CFGS1/zfp-0.5.5-GCCcore-10.2.0.eb
 * $CFGS1/zfp-1.0.0-GCCcore-9.3.0.eb
 * $CFGS1/zfp-1.0.0-GCCcore-10.3.0.eb

This tells us that there are a few different EasyConfigs available to help us install different versions of zfp under different toolchains.
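The file names in this listing follow a name-version-toolchain pattern. As a rough illustration, the sketch below splits one apart with shell parameter expansion. It assumes the software name contains no hyphen and that both a version and a toolchain are present, which holds for the zfp files above but not for every EasyConfig; parse_easyconfig is a helper name made up for this example.

```shell
# Split an EasyConfig file name of the form <name>-<version>-<toolchain>.eb.
# Assumes the software name itself contains no hyphen (true for zfp).
parse_easyconfig() {
    base=${1%.eb}         # strip the .eb extension
    name=${base%%-*}      # text before the first hyphen
    rest=${base#*-}       # everything after the first hyphen
    version=${rest%%-*}   # text before the next hyphen
    toolchain=${rest#*-}  # remainder, e.g. GCCcore-10.3.0
    echo "name=$name version=$version toolchain=$toolchain"
}

parse_easyconfig zfp-1.0.0-GCCcore-10.3.0.eb
parse_easyconfig zfp-0.5.5-GCCcore-10.2.0.eb
```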

What is a toolchain?

A toolchain is a set of software dependencies used to install new software. Most often, this is a compiler like GCC or a compiler/MPI pair like GCC and OpenMPI. The most basic toolchains are just single compilers and are labeled using their software version (like GCCcore-10.2.0).

EasyBuild organizes installed modules by toolchain. For example, if you look for the R/4.2.2 module file, it's under /opt/modules/MPI/GCC/11.2.0/OpenMPI/4.1.1/R/4.2.2.lua because it was built using a GCC/OpenMPI toolchain.

Some of these are so commonly used that EasyBuild groups dependency software into larger toolchains like foss and intel that contain a compiler/MPI pair and a number of other common dependencies. These are labeled by their year and an a or b for the first or second half of the year. You can check what's in them by searching for their EasyConfig and showing it with eb --show-ec:

input
ebF foss
output
...
====== $ebF_PATH/f/foss/
foss-2016.04.eb  foss-2016b.eb    foss-2018b.eb  foss-2021a.eb    foss-2022b.eb
foss-2016.06.eb  foss-2017a.eb    foss-2019a.eb  foss-2021b.eb
foss-2016.07.eb  foss-2017b.eb    foss-2019b.eb  foss-2022.05.eb
foss-2016.09.eb  foss-2018.08.eb  foss-2020a.eb  foss-2022.10.eb
foss-2016a.eb    foss-2018a.eb    foss-2020b.eb  foss-2022a.eb
...
input
eb --show-ec foss-2022a.eb
output
easyblock = 'Toolchain'

name = 'foss'
version = '2022a'

homepage = 'https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain'
description = """GNU Compiler Collection (GCC) based compiler toolchain, including
 OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK."""

toolchain = SYSTEM

local_gccver = '11.3.0'

# toolchain used to build foss dependencies
local_comp_mpi_tc = ('gompi', version)

# we need GCC and OpenMPI as explicit dependencies instead of gompi toolchain
# because of toolchain preparation functions
dependencies = [
    ('GCC', local_gccver),
    ('OpenMPI', '4.1.4', '', ('GCC', local_gccver)),
    ('FlexiBLAS', '3.2.0', '', ('GCC', local_gccver)),
    ('FFTW', '3.3.10', '', ('GCC', local_gccver)),
    ('FFTW.MPI', '3.3.10', '', local_comp_mpi_tc),
    ('ScaLAPACK', '2.2.0', '-fb', local_comp_mpi_tc),
]

moduleclass = 'toolchain'

We can see that foss includes GCC, OpenMPI, FlexiBLAS, FFTW, FFTW.MPI, and ScaLAPACK.
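To make the year-plus-letter labels of the common toolchains concrete, here is a small sketch that spells out what a label like 2022a means. The function name describe_toolchain_version is invented for this example, and labels in other styles (such as the 2022.05 entries in the listing above) are simply reported as not matching.

```shell
# Describe a common-toolchain version label such as 2022a or 2021b.
# describe_toolchain_version is a hypothetical helper for illustration only.
describe_toolchain_version() {
    case "$1" in
        [0-9][0-9][0-9][0-9]a) echo "first half of ${1%a}" ;;
        [0-9][0-9][0-9][0-9]b) echo "second half of ${1%b}" ;;
        *) echo "not a year-plus-letter label: $1" ;;
    esac
}

describe_toolchain_version 2022a
describe_toolchain_version 2021b
describe_toolchain_version 2022.05
```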

We'll try to install the newest version of zfp with the newest compiler there's a corresponding EasyConfig for: zfp-1.0.0-GCCcore-10.3.0.eb.

Checking dependencies

One of the great parts of EasyBuild is that it will handle the dependencies of the software you're using for you. If they're not already installed, it will use other EasyConfigs to install them.

We can check to see if we're missing any of zfp's dependencies on the system using ebS -M:

input
ebS -M zfp-1.0.0-GCCcore-10.3.0.eb
output
...
7 out of 16 required modules missing:

* help2man/1.48.3-GCCcore-10.3.0 (help2man-1.48.3-GCCcore-10.3.0.eb)
* M4/1.4.18-GCCcore-10.3.0 (M4-1.4.18-GCCcore-10.3.0.eb)
* zlib/1.2.11-GCCcore-10.3.0 (zlib-1.2.11-GCCcore-10.3.0.eb)
* Bison/3.7.6-GCCcore-10.3.0 (Bison-3.7.6-GCCcore-10.3.0.eb)
* flex/2.6.4-GCCcore-10.3.0 (flex-2.6.4-GCCcore-10.3.0.eb)
* binutils/2.36.1-GCCcore-10.3.0 (binutils-2.36.1-GCCcore-10.3.0.eb)
* zfp/1.0.0-GCCcore-10.3.0 (zfp-1.0.0-GCCcore-10.3.0.eb)
...

So seven modules are missing: zfp itself and six of its dependencies. The rest are already available on the HPCC. EasyBuild will install the missing dependencies for us, so long as we use the --robot option when installing zfp. This option is included in the ebS alias by default, so you won't need to worry about it.
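For a longer list, it can be handy to pull just the module names out of this report. The sketch below works on text copied from the output above rather than calling ebS, and it assumes each missing module sits on a line starting with "* ", as shown in the tutorial. The list_missing helper is a name made up for this example.

```shell
# Sample text copied from the ebS -M output shown in the tutorial.
sample_output='7 out of 16 required modules missing:

* help2man/1.48.3-GCCcore-10.3.0 (help2man-1.48.3-GCCcore-10.3.0.eb)
* M4/1.4.18-GCCcore-10.3.0 (M4-1.4.18-GCCcore-10.3.0.eb)
* zlib/1.2.11-GCCcore-10.3.0 (zlib-1.2.11-GCCcore-10.3.0.eb)
* Bison/3.7.6-GCCcore-10.3.0 (Bison-3.7.6-GCCcore-10.3.0.eb)
* flex/2.6.4-GCCcore-10.3.0 (flex-2.6.4-GCCcore-10.3.0.eb)
* binutils/2.36.1-GCCcore-10.3.0 (binutils-2.36.1-GCCcore-10.3.0.eb)
* zfp/1.0.0-GCCcore-10.3.0 (zfp-1.0.0-GCCcore-10.3.0.eb)'

# Print the module name (second field) of every bullet line read from stdin.
# list_missing is a hypothetical helper, not an EasyBuild command.
list_missing() {
    awk '/^\* / { print $2 }'
}

printf '%s\n' "$sample_output" | list_missing
missing_count=$(printf '%s\n' "$sample_output" | awk '/^\* / { n++ } END { print n + 0 }')
echo "missing modules: $missing_count"
```

In a live session, the same function could read from a pipe, for example ebS -M zfp-1.0.0-GCCcore-10.3.0.eb | list_missing, provided the real output keeps the layout shown here.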

"It's already on the HPCC, why is it missing!"

Unfortunately, since some HPCC modules are behind other "gateway" modules (e.g., to load R, you first have to load GCC and OpenMPI), they are unavailable as dependencies of user installs. If the software you try to install needs something behind one of these gateways as a dependency, EasyBuild will install another copy for you.

Installing

Now we're ready to install. We just use the ebS alias with our EasyConfig, and hope things go well!

We can run shorter installs like this on a development node. By default, EasyBuild will try to parallelize compilation using all of the cores on the machine. To be a good dev node neighbor, we can use the --parallel option to only use a few of the cores, and leave the rest of the machine useable for everyone. For longer builds, you might consider running EasyBuild through a batch or interactive job.
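How many cores counts as "a few" is a judgment call. The sketch below shows one possible rule of thumb, which is purely an assumption for illustration and not an ICER policy: take a quarter of the node's cores, but never fewer than 1 or more than 8. The polite_cores helper is a name made up for this example.

```shell
# Hypothetical rule of thumb: a quarter of the cores, clamped to the range 1..8.
polite_cores() {
    total=$1
    cores=$((total / 4))
    if [ "$cores" -lt 1 ]; then
        cores=1
    fi
    if [ "$cores" -gt 8 ]; then
        cores=8
    fi
    echo "$cores"
}

# nproc reports the number of available cores; fall back to 1 if it is missing.
total_cores=$(nproc 2>/dev/null || echo 1)
echo "suggested --parallel value: $(polite_cores "$total_cores")"
```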

And don't forget to change your software install path!

input
ebS --parallel=8 --installpath-software=$SCRATCH/software zfp-1.0.0-GCCcore-10.3.0.eb
output
...
== processing EasyBuild easyconfig /mnt/ufs18/home-237/k0068027/zfp-1.0.0-GCCcore-10.3.0.eb
== building and installing zfp/1.0.0-GCCcore-10.3.0...
== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== ... (took 1 secs)
== configuring...
== building...
== ... (took 5 secs)
== testing...
== installing...
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 8 secs)
== Results of the build can be found in the log file(s) /mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/easybuild/easybuild-zfp-1.0.0-20230713.150451.log
...

This process should take about five minutes in total.

There's a lot of output here, but we can see that it completed steps like configuring, building, testing, installing, and checking that important files are where they're supposed to be. And not only did it do this for zfp, but for all of the dependencies too. You would usually have to do this all manually!

We can check that the software is where it's supposed to be.

input
ls $SCRATCH/software/zfp/1.0.0-GCCcore-10.3.0
output
bin  easybuild  include  lib  lib64
input
ls $SCRATCH/software/zfp/1.0.0-GCCcore-10.3.0/bin
output
testzfp  zfp

Note that the installation is stored under the zfp directory in a directory labeled with the software and toolchain versions. If, in the future, we wanted a new version or one built with a different compiler, these two versions can coexist in different directories.
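The layout described here can be written down as a tiny helper. This is a sketch of the naming pattern seen in this tutorial only; install_dir is a made-up name, and the second call uses the other zfp EasyConfig from the earlier search purely to show that the two paths would not collide.

```shell
# Build the install directory for a package from its parts.
# $1 = installpath-software, $2 = name, $3 = version, $4 = toolchain
# install_dir is a hypothetical helper mirroring the layout seen in this tutorial.
install_dir() {
    printf '%s/%s/%s-%s\n' "$1" "$2" "$3" "$4"
}

# Prefix copied from the tutorial's output; yours will differ.
software_root=/mnt/gs21/scratch/k0068027/software

install_dir "$software_root" zfp 1.0.0 GCCcore-10.3.0
install_dir "$software_root" zfp 0.5.5 GCCcore-10.2.0
```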

Using the software

Now that it's installed, we can try to test it with the testzfp executable. We know exactly where it is, so let's try to run it from its installation directory:

input
$SCRATCH/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp
output
/mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp)
/mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp)

Well, that doesn't look right... It looks like there are some missing libraries. The real issue is that we didn't load the module for the new software we built! That would have also loaded all the dependencies for us!
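Errors like this are easier to read once the repeated paths are stripped away. The sketch below takes the two error lines from above and lists only the library versions the program asked for. It is a diagnostic aid under the assumption that the messages keep this GLIBCXX_x.y.z wording; missing_glibcxx is a name made up for this example.

```shell
# Error text copied from the tutorial's output.
error_text=$(cat <<'EOF'
/mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp)
/mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0/bin/testzfp)
EOF
)

# List each requested GLIBCXX version once, reading error messages from stdin.
# missing_glibcxx is a hypothetical helper for illustration.
missing_glibcxx() {
    grep -o 'GLIBCXX_[0-9.]*' | sort -u
}

printf '%s\n' "$error_text" | missing_glibcxx
```

The messages name /lib64/libstdc++.so.6, the system's copy of the C++ standard library, which is older than the one zfp was built against with GCCcore 10.3.0.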

Remember that modules are installed into $HOME/modules. We can add these to our "module path" so that they show up when we try to do module load:

input
module use $HOME/modules
echo $MODULEPATH
output
/mnt/home/k0068027/modules:/opt/software/hpcc/modules:/opt/modules/Core
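The module path is just a colon-separated list of directories, so checking whether a directory is already on it takes only a few lines of shell. The sketch below runs against the sample value printed above rather than your live environment, and path_list_contains is a helper name invented for this example.

```shell
# Succeed if a colon-separated list contains the given directory exactly.
# $1 = colon-separated list, $2 = directory to look for
# path_list_contains is a hypothetical helper, not part of the module system.
path_list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample value copied from the tutorial's output.
sample_modulepath=/mnt/home/k0068027/modules:/opt/software/hpcc/modules:/opt/modules/Core

if path_list_contains "$sample_modulepath" /mnt/home/k0068027/modules; then
    echo "personal modules directory is on the module path"
else
    echo "personal modules directory is missing; run: module use \$HOME/modules"
fi
```

In a real session, the call would be path_list_contains "$MODULEPATH" "$HOME/modules".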

Let's try searching for it:

input
module spider zfp
output
----------------------------------------------------------------------------
  zfp: zfp/1.0.0-GCCcore-10.3.0
----------------------------------------------------------------------------
    Description:
      zfp is a compressed format for representing multidimensional
      floating-point and integer arrays. zfp provides compressed-array
      classes that support high throughput read and write random access to
      individual array elements. zfp also supports serial and parallel
      (OpenMP and CUDA) compression of whole arrays, e.g., for applications
      that read and write large data sets to and from disk.


     Other possible modules matches:
        lib/zfp

    This module can be loaded directly: module load zfp/1.0.0-GCCcore-10.3.0
...

Notice that the version is followed by the toolchain it depends on. Now we can load it using the command module spider gave.

input
module load zfp/1.0.0-GCCcore-10.3.0
module list
output
Currently Loaded Modules:
  1) EasyBuild/4.6.2   2) GCCcore/10.3.0   3) zfp/1.0.0-GCCcore-10.3.0

This does a few things including

  • adding zfp's bin directory to our path,
  • adding zfp's lib directory to our LD_LIBRARY_PATH (so we can compile new programs using the libraries it provides in the future),
  • and loading the modules for its runtime dependencies.
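To see what the first two items amount to, here is a rough by-hand equivalent. It is only a sketch for illustration: it uses the install prefix from this tutorial's output, it is not what the generated module file literally contains, and it skips the third item entirely, so running testzfp after these lines alone would still hit the missing-library error until GCCcore is loaded.

```shell
# Install prefix copied from the tutorial's output; yours will differ.
zfp_prefix=/mnt/gs21/scratch/k0068027/software/zfp/1.0.0-GCCcore-10.3.0

# Put zfp's executables first on the search path.
export PATH="$zfp_prefix/bin:$PATH"

# Put zfp's libraries first on the library path, avoiding a stray trailing
# colon when LD_LIBRARY_PATH was previously empty or unset.
export LD_LIBRARY_PATH="$zfp_prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "PATH now starts with: ${PATH%%:*}"
echo "LD_LIBRARY_PATH now starts with: ${LD_LIBRARY_PATH%%:*}"
```

Letting module load do this is preferable, since module unload can then undo every change cleanly.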

This means we can run testzfp without using its absolute path:

input
testzfp
output
zfp version 1.0.0 (August 1, 2022)
library version 4096
CODEC version 5
data model LP64

testing 1D array of floats
  compress:   rate= 2                                                 OK 
  decompress: rate= 2 1.626e+01 <= 1.627e+01                          OK 
...
all tests passed

Though this was a small example, this workflow should get you through most EasyBuild installations. Check out our EasyBuild reference in the future if you need a quick refresher on the most important commands.

(Oh, and you can delete the $HOME/modules and $SCRATCH/software directories to start fresh for your real installations.)