EasyBuild Tutorial
One of the most complex parts of using an HPCC can often be installing the software you want to use. EasyBuild is a piece of software that helps simplify the process. It compiles well-tested recipes contributed by people installing this software (often on HPCCs just like MSU's) all over the world. It's also what ICER uses to install all of the modules you use on the HPCC!
A warning: while "easy" is in the name, don't ever expect installing software on the HPCC to be easy... But EasyBuild is probably the closest you'll get.
What kind of software do you need?
EasyBuild can help you install all kinds of software, but there are other options that we recommend for things like Python and R. For Python, we recommend Anaconda, and for R we recommend using the built-in install.packages command. EasyBuild will be most helpful if you need to compile a piece of software from scratch that somebody else has created a recipe for.
In this tutorial, we are going to try to install a piece of software called zfp that's not already on the HPCC.
Loading EasyBuild
To get started, we load the EasyBuild module with module load EasyBuild.
We now have access to the eb command, which does everything you need in EasyBuild, as well as the alias ebS, which takes the place of the eb command (with some nice defaults set up).
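A quick way to confirm the module loaded is to ask the shell whether eb is on the path. This is a minimal sketch: it assumes only that the command is named eb, as in the text, and it prints a hint instead of failing when the command is absent.

```shell
# Check whether the eb command is available after loading the module.
if command -v eb >/dev/null 2>&1; then
    eb_status="available"
    eb --version          # print the EasyBuild version that was loaded
else
    eb_status="missing"
    echo "eb not found; try 'module load EasyBuild' first"
fi
echo "eb is $eb_status"
```

Because ebS is an alias, it is normally visible only in interactive shells; running type ebS there should show what it expands to.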
Configuring EasyBuild
We can first check our global EasyBuild configuration using ebS --show-config.
Whoa! There are a lot of options here! Some of them have been set up by HPCC staff to make your life easier (compare with the output of eb --show-config), and you shouldn't worry about most of them. However, let's highlight the important ones:
- installpath: This is the root directory for your software installation. In this case, it's my (k0068027) home directory. Usually the software and modules both fall under this directory, but they can be set separately (see the next two options).
- installpath-software: This is where all of your software is actually installed. When your installation is finished, you should be able to find it under its name in this directory.
- installpath-modules: This is where the module files are stored for your software installations. What's a module file? It's how module load mysoftware works! So after you install something with EasyBuild, you'll have built your own personal module that you can load like anything else on the HPCC!
- buildpath: This is where the software is compiled. Usually, keeping it as a /tmp directory is good since it's fast storage on the node for lots of small reading and writing. Once it's built, it gets moved to your installpath-software directory anyways, so it really is temporary.
These are usually good defaults, but you might want to change them. For example, what if you need to install a piece of software in a research space so everyone in your group can access it? Or what if your home directory is filling up, you only need the software temporarily, and are okay installing it into your scratch space?
For the sake of this tutorial, we'll practice by installing the software into our scratch space, but leaving the module files in our home directory.
To change the configuration, we can pass the new value as a command line argument to any ebS command. For example, adding --installpath-software=$SCRATCH/software to ebS --show-config shows the configuration with the new software path.
Great! The (C) in the output also tells us that this option was set using a command line argument. We'll have to make sure to include this option when we actually try to install the software: it's not "set and forget"!
As you can see from the output, there are multiple ways to configure EasyBuild. If you want to make more lasting changes that are "set and forget", try setting up a configuration file.
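The two approaches can be sketched side by side. This is an illustration under assumptions, not the HPCC's documented setup: it assumes $SCRATCH points at your scratch space (with a placeholder fallback), that the configuration file lives at EasyBuild's default user location ~/.config/easybuild/config.cfg, and it only prints the file fragment rather than writing it.

```shell
# Assumption: $SCRATCH points at your scratch space; the fallback is a placeholder.
software_dir="${SCRATCH:-$HOME/scratch}/software"

# Option 1: keep the override in a variable and repeat it on every command.
eb_opts="--installpath-software=$software_dir"
echo "per-command: eb $eb_opts --show-config"

# Option 2: a configuration file fragment carrying the same setting.
# Assumption: saved as ~/.config/easybuild/config.cfg (not written here).
config_snippet="[config]
installpath-software = $software_dir"
printf '%s\n' "$config_snippet"
```

Command line options take precedence over configuration file settings, so it is worth rerunning ebS --show-config afterwards to confirm that the alias does not override the value from the file.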
Finding our EasyConfig
So now that we're happy with and (mostly) understand our ebS --show-config results, we can try finding the recipe for the software we'd like to install. These recipes are called EasyConfigs and there's a good chance that someone has already created one for the software you're trying to install.
The list of EasyConfigs is stored on the HPCC and we can search through it using the -S option of ebS, followed by the name of the software (here, zfp).
This tells us that there are a few different EasyConfigs available to help us install different versions of zfp under different toolchains.
What is a toolchain?
A toolchain is a set of software dependencies used to install new software. Most often, this is a compiler like GCC or a compiler/MPI pair like GCC and OpenMPI. The most basic toolchains are just single compilers and are labeled using their software version (like GCCcore-10.2.0).
EasyBuild organizes installed modules by toolchain. For example, if you look for the R/4.2.2 module file, it's under /opt/modules/MPI/GCC/11.2.0/OpenMPI/4.1.1/R/4.2.2.lua because it was built using a GCC/OpenMPI toolchain.
Some of these are so commonly used that EasyBuild groups dependency software into larger toolchains like foss and intel that contain a compiler/MPI pair and a number of other common dependencies. These are labeled by their year and an a or b for the first or second half of the year. You can check what's in them by searching for their EasyConfig and showing it with eb --show-ec.
We can see that foss includes GCC, OpenMPI, FlexiBLAS, FFTW, FFTW.MPI, and ScaLAPACK.
We'll try to install the newest version of zfp with the newest compiler there's a corresponding EasyConfig for: zfp-1.0.0-GCCcore-10.3.0.eb.
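The EasyConfig file name itself encodes the software name, its version, and the toolchain. As a small sketch, shell parameter expansion can pull the three parts apart. The variable names are illustrative, and the splitting assumes the software name and version contain no hyphens, which holds for zfp but not for every EasyConfig.

```shell
ec_file="zfp-1.0.0-GCCcore-10.3.0.eb"

base="${ec_file%.eb}"      # drop the .eb extension
name="${base%%-*}"         # text before the first hyphen
rest="${base#*-}"          # everything after the first hyphen
version="${rest%%-*}"      # software version
toolchain="${rest#*-}"     # toolchain name and version

echo "software=$name version=$version toolchain=$toolchain"
# prints: software=zfp version=1.0.0 toolchain=GCCcore-10.3.0
```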
Checking dependencies
One of the great parts of EasyBuild is that it will handle the dependencies of the software you're installing for you. If they're not already installed, it will use other EasyConfigs to install them.
We can check to see if we're missing any of zfp's dependencies on the system using ebS -M followed by the EasyConfig name.
So we're missing seven dependencies; the rest are already available on the HPCC. EasyBuild will install those for us, so long as we use the --robot option when installing zfp. This option is included in the ebS alias by default, so you won't need to worry about it.
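For use outside an interactive shell, where the ebS alias may not be defined, the same check can be written with plain eb. This is a hedged sketch: it spells out --missing-modules (the long form of -M) and adds --robot explicitly, since the defaults baked into ebS are not known here. It assumes an sh or bash shell, where the unquoted variable is split into words.

```shell
ec_file="zfp-1.0.0-GCCcore-10.3.0.eb"
check_cmd="eb $ec_file --missing-modules --robot"

# Run the check when eb is available; otherwise just show the command.
if command -v eb >/dev/null 2>&1; then
    $check_cmd
else
    echo "would run: $check_cmd"
fi
```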
"It's already on the HPCC, why is it missing!"
Unfortunately, since some HPCC modules are behind other "gateway" modules (e.g., to load R, you first have to load GCC and OpenMPI), they are unavailable as dependencies of user installs. If the software you try to install needs something behind one of these gateways as a dependency, EasyBuild will install another copy for you.
Installing
Now we're ready to install. We just use the ebS alias with our EasyConfig, and hope things go well!
We can run shorter installs like this on a development node. By default, EasyBuild will try to parallelize compilation using all of the cores on the machine. To be a good dev node neighbor, we can use the --parallel option to only use a few of the cores, and leave the rest of the machine usable for everyone. For longer builds, you might consider running EasyBuild through a batch or interactive job.
And don't forget to change your software install path!
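Putting the pieces together, an install command might look like the sketch below. Several things here are assumptions rather than the tutorial's exact command: the core count of 4, the $SCRATCH fallback, the use of plain eb with an explicit --robot in place of the ebS alias, and the RUN_INSTALL switch, which exists only in this sketch so that the command is printed unless you opt in. It assumes an sh or bash shell.

```shell
ec_file="zfp-1.0.0-GCCcore-10.3.0.eb"
software_dir="${SCRATCH:-$HOME/scratch}/software"   # placeholder fallback
build_cores=4                                       # hypothetical; leave cores for others

install_cmd="eb $ec_file --robot --parallel=$build_cores --installpath-software=$software_dir"

# RUN_INSTALL is a switch local to this sketch: the build starts only when it is 1.
if [ "${RUN_INSTALL:-0}" = "1" ] && command -v eb >/dev/null 2>&1; then
    $install_cmd
else
    echo "would run: $install_cmd"
fi
```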
This process should take about five minutes in total. There's a lot of output here, but we can see that it completed steps like configuring, building, testing, installing, and checking that important files are where they're supposed to be. And not only did it do this for zfp, but for all of the dependencies too. You would usually have to do this all manually!
We can check that the software is where it's supposed to be.
Note that the installation is stored under the zfp directory in a directory labeled with the software and toolchain versions. If, in the future, we wanted a new version or one built with a different compiler, these two versions can coexist in different directories.
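The layout described above can be inspected with a short sketch. The paths are assumptions: the software root is taken to be $SCRATCH/software with a placeholder fallback, and the versioned directory name is inferred from the EasyConfig name rather than copied from real output.

```shell
software_dir="${SCRATCH:-$HOME/scratch}/software"   # placeholder fallback
zfp_root="$software_dir/zfp"
expected_dir="$zfp_root/1.0.0-GCCcore-10.3.0"       # inferred, not confirmed

# Each installed version/toolchain combination should appear as its own directory.
if [ -d "$zfp_root" ]; then
    ls -1 "$zfp_root"
else
    echo "no installs found under $zfp_root"
fi
echo "expected install directory: $expected_dir"
```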
Using the software
Now that it's installed, we can try to test it with the testzfp executable. We know exactly where it is, so let's try to run it from its installation directory.
Well, that doesn't look right... It looks like there are some missing libraries. The real issue is that we didn't load the module for the new software we built! That would have also loaded all the dependencies for us!
Remember that modules are installed into $HOME/modules. We can add these to our "module path" with module use so that they show up when we try to do module load.
Let's try searching for it with module spider.
Notice that the version is followed by the toolchain it depends on. Now we can load it using the command module spider gave.
This does a few things, including
- adding zfp's bin directory to our path,
- adding zfp's lib directory to our LD_LIBRARY_PATH (so we can compile new programs using the libraries it provides in the future),
- and loading the modules for its runtime dependencies.
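To make the first two effects concrete, here is roughly what they amount to if done by hand. This is only an illustration: the install prefix is a hypothetical path, and the sketch does not load any dependency modules, which is why module load remains the better route.

```shell
# Hypothetical install prefix; substitute the real directory from your install.
prefix="${SCRATCH:-$HOME/scratch}/software/zfp/1.0.0-GCCcore-10.3.0"

# Prepend the bin directory so executables such as testzfp are found first.
PATH="$prefix/bin:$PATH"

# Prepend the lib directory, adding a separator only if the variable was already set.
LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

export PATH LD_LIBRARY_PATH
echo "first PATH entry: ${PATH%%:*}"
```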
This means we can run testzfp without using its absolute path.
Though this was a small example, this workflow should get you through most EasyBuild installations. Check out our EasyBuild reference in the future if you need a quick refresher on the most important commands.
(Oh, and you can delete the $HOME/modules and $SCRATCH/software directories to start fresh for your real installations.)