EasyBuild Tutorial
One of the most complex parts of using an HPCC can often be installing the software you want to use. EasyBuild is a piece of software that helps simplify the process. It compiles well-tested recipes contributed by people installing this software (often on HPCCs just like MSU's) all over the world. It's also what ICER uses to install all of the modules you use on the HPCC!
A warning: while "easy" is in the name, don't ever expect installing software on the HPCC to be easy... But EasyBuild is probably the closest you'll get.
What kind of software do you need?
EasyBuild can help you install all kinds of software, but there are other
options that we recommend for things like Python and R. For Python, we
recommend Conda, and for R we recommend using the built-in install.packages
command. EasyBuild will be most helpful if you
need to compile a piece of software from scratch that somebody else has
created a recipe for.
In this tutorial, we are going to try to install a piece of software called zfp that's not already on the HPCC.
Loading EasyBuild
To get started, we load the EasyBuild module:
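As a minimal sketch, loading the module might look like the following. The module name EasyBuild is assumed from the section heading, and the guard is only there so the snippet is harmless in a shell without the module system:

```shell
eb_module="EasyBuild"   # assumed module name; confirm with: module spider EasyBuild

# 'module' is a shell function provided by the cluster's module system,
# so guard the call for shells where it is not defined.
if type module >/dev/null 2>&1; then
    module load "$eb_module"
    # Print where eb lives if the load worked
    command -v eb || echo "eb is still not on the PATH"
else
    echo "module command not found; run this on the HPCC"
fi
```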
We now have access to the eb command, which does everything you need in
EasyBuild.
Configuring EasyBuild
We can first check our global EasyBuild configuration using eb --show-config:
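A small sketch of that check, guarded so that it does nothing harmful when eb is not yet available (the config_shown variable is only illustrative):

```shell
# Print the active EasyBuild configuration if eb is available in this shell.
if command -v eb >/dev/null 2>&1; then
    eb --show-config
    config_shown=yes
else
    echo "eb not found; load the EasyBuild module first"
    config_shown=no
fi
```

The exact settings listed depend on your account and on how the site has configured EasyBuild.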
Let's highlight the most important ones:
- installpath: This is the root directory for your software installation. In this case, it's inside my (k0068027) home directory in the hidden directory .local/easybuild. By default, the software and modules both fall under this directory, but they can be set separately (see the next two options).
- installpath-software: This is where all of your software is actually installed. When your installation is finished, you should be able to find it under its name in this directory.
- installpath-modules: This is where the module files are stored for your software installations. What's a module file? It's how module load mysoftware works! So after you install something with EasyBuild, you'll have built your own personal module that you can load like anything else on the HPCC!
- buildpath: This is where the software is compiled. Usually, setting it to a directory in /tmp is good since it's fast storage on the node for lots of small reads and writes. Once it's built, it gets moved to your installpath-software directory anyways, so it really is temporary.
These are usually good defaults, but you might want to change them. For example, what if you need to install a piece of software in a research space so everyone in your group can access it? Or what if your home directory is filling up, you only need the software temporarily, and are okay installing it into your scratch space?
For the sake of this tutorial, we'll practice by using a directory in /tmp as the build directory, installing the software into our scratch space, and leaving the module files in our home directory.
To change the configuration, we can pass the new values as command line arguments to any eb command:
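One way this could look is sketched below. The build directory name under /tmp is hypothetical, and $SCRATCH/software is taken from the cleanup note at the end of this tutorial; $SCRATCH is assumed to be set by the site to your scratch space:

```shell
build_dir="/tmp/$USER/easybuild-build"   # hypothetical directory name under /tmp
software_dir="$SCRATCH/software"         # assumes the site defines $SCRATCH

# Show the configuration with the two paths overridden for this one command.
if command -v eb >/dev/null 2>&1; then
    eb --buildpath="$build_dir" \
       --installpath-software="$software_dir" \
       --show-config
fi
```

Because the overrides are ordinary command line options, they apply only to the eb command they are passed to.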
Great! The (C) also tells us that the options were set using a command line
argument. We'll have to make sure to include these options when we actually
try to install the software: it's not "set and forget"!
As you can see from the output, there are multiple ways to configure EasyBuild. If you want to make more lasting changes that are "set and forget", try setting up a configuration file.
Finding our EasyConfig
So now that we're happy with and (mostly) understand our eb --show-config
results, we can try finding the recipe for the software we'd like to install.
These recipes are called EasyConfigs and there's a good chance that someone
has already created one for the software you're trying to install.
The list of EasyConfigs is stored on the HPCC and we can search through it
using the -S option of eb:
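A sketch of the search, assuming we simply use the software name as the query:

```shell
query="zfp"   # name of the software to look for

# -S is the short form of --search-short: it lists matching EasyConfig files.
if command -v eb >/dev/null 2>&1; then
    eb -S "$query"
fi
```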
This tells us that there are a few different EasyConfigs available to help us install
different versions of zfp under different toolchains.
What is a toolchain?
A toolchain is a set
of software dependencies used to install new software. Most often, this is
a compiler like GCC or a compiler/MPI pair like GCC and OpenMPI. The most
basic toolchains are just single compilers and are labeled using their
software version (like GCCcore-12.3.0).
Some of these are so commonly used that EasyBuild groups dependency
software into larger toolchains like foss and intel
that contain a compiler/MPI pair and a number of other common dependencies.
These are labeled by their year and an a or b for the first or second
half of the year. You can check what's in them by searching for their
EasyConfig and showing it with eb --show-ec:
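The two steps might be sketched as follows. The choice of foss-2023a is an assumption made for illustration (it is the foss release built on GCC 12.3.0); substitute whichever file name the search actually reports:

```shell
toolchain="foss-2023a"        # assumed toolchain version, for illustration only
toolchain_ec="$toolchain.eb"  # EasyConfig file name as it appears in search results

if command -v eb >/dev/null 2>&1; then
    eb -S "$toolchain"            # find the toolchain's EasyConfig
    eb --show-ec "$toolchain_ec"  # print the contents of that EasyConfig
fi
```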
We can see that foss includes GCC, OpenMPI, FlexiBLAS, FFTW,
FFTW.MPI, and ScaLAPACK.
We'll try to install the newest version of zfp with the newest compiler
there's a corresponding EasyConfig for: zfp-1.0.1-GCCcore-12.3.0.eb.
Checking dependencies
One of the great parts of EasyBuild is that it will handle the dependencies of the software you're installing for you. If they're not already installed, it will use other EasyConfigs to install them.
We can check to see if we're missing any of zfp's dependencies on the system
using eb -M:
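A sketch of the dependency check, using the EasyConfig chosen above:

```shell
easyconfig="zfp-1.0.1-GCCcore-12.3.0.eb"

# -M is the short form of --missing-modules: it reports which modules
# required by this EasyConfig are not yet available.
if command -v eb >/dev/null 2>&1; then
    eb -M "$easyconfig"
fi
```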
We're only missing one module: the software itself! The remaining twelve
modules are already available on the HPCC. If there were other dependencies
that weren't installed, EasyBuild would install those for us, so long as we use
the --robot option when installing zfp.
Installing
Now we're ready to install. We just use the eb alias with our EasyConfig,
and hope things go well!
We can run shorter installs like this on a development
node. By default, EasyBuild will try to parallelize
compilation using all of the cores on the machine. To be a good dev node
neighbor, we can use the --parallel option to only use a few of the cores,
and leave the rest of the machine usable for everyone. For longer builds, you
might consider running EasyBuild through a batch or interactive job.
And don't forget to change your build and software install paths!
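Putting the pieces together, the install command might be sketched like this. The build directory name and the core count of 4 are assumptions for illustration; the software path matches the $SCRATCH/software directory mentioned at the end of this tutorial:

```shell
easyconfig="zfp-1.0.1-GCCcore-12.3.0.eb"
build_dir="/tmp/$USER/easybuild-build"   # hypothetical directory name under /tmp
software_dir="$SCRATCH/software"         # assumes the site defines $SCRATCH
cores=4                                  # assumed: a small share of a dev node

if command -v eb >/dev/null 2>&1; then
    # --robot lets EasyBuild install any missing dependencies as well.
    eb "$easyconfig" \
       --robot \
       --parallel="$cores" \
       --buildpath="$build_dir" \
       --installpath-software="$software_dir"
fi
```

The EasyConfig is listed before --robot on purpose, since --robot can optionally take a path value and this ordering avoids any ambiguity.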
This process should take less than a minute in total.
There's a lot of output here, but we can see that it completed steps like configuring, building, testing, installing, and checking that important files are where they're supposed to be. And not only would it do this for the software you requested, but for all of the dependencies too. You would usually have to do this all manually!
We can check that the software is where it's supposed to be.
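A sketch of that check, assuming the software was installed under $SCRATCH/software and that the version directory is named after the EasyConfig (version plus toolchain):

```shell
software_dir="$SCRATCH/software"       # assumes the site defines $SCRATCH
version_dir="1.0.1-GCCcore-12.3.0"     # assumed from the EasyConfig file name
install_dir="$software_dir/zfp/$version_dir"

if [ -d "$install_dir" ]; then
    ls "$software_dir/zfp"   # one directory per installed version
    ls "$install_dir"        # the files of this particular installation
else
    echo "nothing installed at $install_dir yet"
fi
```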
Note that the installation is stored under the zfp directory in a directory
labeled with the software and toolchain versions. If, in the future, we wanted
a new version or one built with a different compiler, these two versions can
coexist in different directories.
Using the software
Now that it's installed, we can try to test it with the testzfp executable.
But just because we know where the software is, doesn't mean we're ready to run
it! When we install software, it needs to be linked to the proper libraries
that also need to be available when the software runs. This includes all the
dependencies of zfp!
The right way to do this is to use the module file that comes when we install
the software. EasyBuild installs modules into
$HOME/.local/easybuild/modules/all by default. We can add these to our
"module path" so that they show up when we try to do module load:
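A sketch of extending the module path, using the default location named above. Printing $MODULEPATH afterwards is an assumed way of confirming the change:

```shell
module_dir="$HOME/.local/easybuild/modules/all"

# 'module use' prepends a directory to the list searched for module files.
if type module >/dev/null 2>&1; then
    module use "$module_dir"
    echo "$MODULEPATH"   # the new directory should now appear in this list
fi
```

This only affects the current shell session, so it has to be repeated in new sessions and job scripts.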
Let's try searching for it:
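Assuming the site's module system provides the spider subcommand (the text below refers to it), the search might look like:

```shell
module_name="zfp"

# 'module spider' searches all known modules, including ones that need
# other modules loaded first, and explains how to load each match.
if type module >/dev/null 2>&1; then
    module spider "$module_name"
fi
```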
Notice that the version is followed by the toolchain it depends on. Now we can load it using the command module spider gave.
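A sketch of the load step. The full module name below is an assumption built from the EasyConfig name (version followed by toolchain); use whatever name module spider actually reports:

```shell
zfp_module="zfp/1.0.1-GCCcore-12.3.0"   # assumed name; copy the one spider prints

if type module >/dev/null 2>&1; then
    module load "$zfp_module"
    module list   # confirm that zfp and its dependencies are loaded
fi
```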
This does a few things including
- adding zfp's bin directory to our path,
- adding zfp's lib directory to our LD_LIBRARY_PATH (so we can compile new programs using the libraries it provides in the future),
- and loading the modules for its runtime dependencies.
This means we can run testzfp without using its absolute path:
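A sketch of running the test program by name, guarded in case the zfp module has not been loaded in this shell (ran_test is only an illustrative variable):

```shell
# testzfp is found through the PATH once the zfp module is loaded.
if command -v testzfp >/dev/null 2>&1; then
    testzfp
    ran_test=yes
else
    echo "testzfp not on the PATH; load the zfp module first"
    ran_test=no
fi
```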
Though this was a small example, this workflow should get you through most EasyBuild installations. Check out our EasyBuild reference in the future if you need a quick refresher on the most important commands.
(Oh, and you can delete the $HOME/.local/easybuild/modules and
$SCRATCH/software directories to start fresh for your real installations.)