This is a Lab Notebook that describes how to solve a specific problem at a specific time.
Please keep this in mind as you read and use the content. Please pay close attention to the date, version
information and other details.
Lab Notebook --- Using EasyBuild to install a PostgreSQL compatible with R (2023-05-18)
Problem setup
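A user wanted to use PostgreSQL with the R/4.2.2 module. However, the versions
of PostgreSQL installed on the HPCC were not compatible with the toolchain that
R/4.2.2 was built with.
So we need a version of PostgreSQL that is compatible with GCC/11.2.0, the
compiler module that R/4.2.2 requires.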
Solution
We often use EasyBuild to install software on the HPCC. One of the nice things
about EasyBuild is that other users can contribute EasyConfigs, which are
recipes for building and installing different types of software.
Loading EasyBuild
To get started, we load the EasyBuild module:
$ module purge
$ module load EasyBuild
We now have access to the eb command and two aliases defined by MSU HPCC
staff: ebF to find EasyConfigs and ebS to install software. We can first
check our global EasyBuild configuration using eb --show-config.
Since I (grosscra) am part of the helpdesk group used by HPCC staff, my
--show-config output will look different from other users' output. In
particular, I am set up to install the software into the root directory /opt,
with modules going into /opt/modules and the actual software going into
/opt/software.
A quick digression on local modules
If you are not in the helpdesk group, EasyBuild will instead use directories in
your $HOME directory (e.g., software in $HOME/software and modules in
$HOME/modules). Thus, using EasyBuild, you can build your own software. If you
add your module directory to your module path (for example, with
module use $HOME/modules), you can then load modules that you install using the
exact same commands you use on the HPCC (e.g., module load PostgreSQL).
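As a minimal sketch of that last step, the snippet below adds a local module directory to the search path. It assumes the modules live in $HOME/modules, as in the example above, and that the module command comes from Lmod or a similar module system. The fallback branch that edits MODULEPATH by hand is only there so the snippet also runs in a shell without a module command.

```shell
# Assumption: locally built modules live in $HOME/modules, as in the example above.
local_modules="$HOME/modules"

if command -v module >/dev/null 2>&1; then
    # Let the module system prepend the directory to its search path.
    module use "$local_modules"
else
    # Fallback when no module command is available: prepend to MODULEPATH by hand.
    export MODULEPATH="$local_modules${MODULEPATH:+:$MODULEPATH}"
fi

# Print the search path, one directory per line, to confirm the change.
printf '%s\n' "$MODULEPATH" | tr ':' '\n'
```

This change only lasts for the current shell session, so it would need to be repeated (or added to a startup file) in each new session.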
Finding our EasyConfig
So now that we're happy with and (mostly) understand our eb --show-config
results, we can try finding the EasyConfig for PostgreSQL we'd like to use. Our
first step is to search for PostgreSQL with the ebF alias.
The results tell us that there are many EasyConfigs available to help us install
different versions of PostgreSQL under different toolchains.
What is a toolchain?
A toolchain is a set of software dependencies used to install new software.
Most often, this is a compiler like GCC or a compiler/MPI pair like GCC and
OpenMPI. The most basic toolchains are just single compilers and are
labeled using their software version (like GCCcore-11.2.0).
EasyBuild organizes installed modules by toolchain. For example, if you
look for the R/4.2.2 module file, it's under
/opt/modules/MPI/GCC/11.2.0/OpenMPI/4.1.1/R/4.2.2.lua because it was
built using a GCC/OpenMPI toolchain.
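To make that layout concrete, here is a small sketch that splits the module file path quoted above into its toolchain part and its module name. It only uses shell parameter expansion, and it assumes the layout is always root/toolchain directories/name/version.lua, which is an inference from this one example rather than a documented rule.

```shell
# Path of the R/4.2.2 module file quoted above.
modulefile="/opt/modules/MPI/GCC/11.2.0/OpenMPI/4.1.1/R/4.2.2.lua"
module_root="/opt/modules"

relative="${modulefile#"$module_root"/}"   # strip the module root
relative="${relative%.lua}"                # strip the module file extension

version="${relative##*/}"                  # last path component
rest="${relative%/*}"                      # everything before the version
name="${rest##*/}"                         # software name
toolchain_dir="${rest%/*}"                 # directories describing the toolchain

echo "module:    $name/$version"
echo "toolchain: $toolchain_dir"
```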
Some compiler/MPI combinations are so commonly used that EasyBuild groups them
into larger toolchains like "foss" and "intel", which contain a
compiler/MPI pair and a number of other common dependencies. These are
labeled by their year and an a or b for the first or second half of the
year. You can check what's in them by searching for their EasyConfig and
showing it with eb --show-ec:
$ ebF foss
...
====== $ebF_PATH/f/foss/
foss-2016.04.eb  foss-2016b.eb    foss-2018b.eb  foss-2021a.eb    foss-2022b.eb
foss-2016.06.eb  foss-2017a.eb    foss-2019a.eb  foss-2021b.eb
foss-2016.07.eb  foss-2017b.eb    foss-2019b.eb  foss-2022.05.eb
foss-2016.09.eb  foss-2018.08.eb  foss-2020a.eb  foss-2022.10.eb
foss-2016a.eb    foss-2018a.eb    foss-2020b.eb  foss-2022a.eb
...
$ eb --show-ec foss-2022a.eb
easyblock = 'Toolchain'

name = 'foss'
version = '2022a'

homepage = 'https://easybuild.readthedocs.io/en/master/Common-toolchains.html#foss-toolchain'
description = """GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK."""

toolchain = SYSTEM

local_gccver = '11.3.0'

# toolchain used to build foss dependencies
local_comp_mpi_tc = ('gompi', version)

# we need GCC and OpenMPI as explicit dependencies instead of gompi toolchain
# because of toolchain preparation functions
dependencies = [
    ('GCC', local_gccver),
    ('OpenMPI', '4.1.4', '', ('GCC', local_gccver)),
    ('FlexiBLAS', '3.2.0', '', ('GCC', local_gccver)),
    ('FFTW', '3.3.10', '', ('GCC', local_gccver)),
    ('FFTW.MPI', '3.3.10', '', local_comp_mpi_tc),
    ('ScaLAPACK', '2.2.0', '-fb', local_comp_mpi_tc),
]

moduleclass = 'toolchain'
We can see that foss includes GCC, OpenMPI, FlexiBLAS, FFTW,
FFTW.MPI, and ScaLAPACK.
Since we know that R/4.2.2 requires GCC/11.2.0 to load, we look for a
PostgreSQL EasyConfig with a compatible toolchain. In this case, we see
PostgreSQL-13.4-GCCcore-11.2.0.eb. If a version of PostgreSQL other than 13.4
were required, we would probably need to generate a new EasyConfig, but this
version was suitable for the user.
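The selection we just did by eye can be sketched as a filter on EasyConfig file names. The helper ecs_for_toolchain below is hypothetical (it is not part of EasyBuild or the HPCC aliases), and it assumes the plain name-version-toolchain.eb naming seen in this notebook, with no version suffix after the toolchain. The two file names passed in are the ones that appear in this notebook.

```shell
# Hypothetical helper: print the EasyConfig file names built with a given toolchain.
# Assumes file names end in -<toolchain>.eb (no version suffix).
ecs_for_toolchain() {
    wanted="$1"
    shift
    for ec in "$@"; do
        case "$ec" in
            *-"$wanted".eb) echo "$ec" ;;
        esac
    done
}

# Keep only the EasyConfigs built with the GCCcore-11.2.0 toolchain.
ecs_for_toolchain GCCcore-11.2.0 \
    PostgreSQL-13.4-GCCcore-11.2.0.eb \
    PostgreSQL-14.4-GCCcore-11.3.0.eb
```

Only PostgreSQL-13.4-GCCcore-11.2.0.eb passes the filter, which matches the choice made above.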
Fixing dependency resolution
Now let's see how the installation will go. Since I'm writing this
after having installed PostgreSQL/13.4, the output would look different than
it did before the install. Instead, I'll use
PostgreSQL-14.4-GCCcore-11.3.0.eb, which as of now is still not installed.
We can check to see if we're missing any of this install's dependencies on the
system using eb -M:
At first, EasyBuild reports more missing modules than just the one we want to
install. What's happening is that the way modules are searched for on the HPCC is
different from the way EasyBuild searches for them. EasyBuild wants to include
"Core/" in front of the core module names (i.e., those that aren't installed
under a toolchain). But if we look at where modules are searched for (the module path),
the "Core" part is already included in the path. This makes it so you don't need to use module load Core/GCC/11.3.0 and can get right to the software name you need.
To make things work correctly with EasyBuild's expectations, we can add
/opt/modules to our module path and try again:
Much better! Now we're only missing the module we want to install.
In the case where we would actually be missing dependencies, EasyBuild would
install those for us, so long as we use the --robot option when installing.
This is included in the ebS alias by default.
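The "Core/" mismatch and its fix can be illustrated with a toy model. The sketch below is a deliberate simplification and not how Lmod or EasyBuild actually resolve modules: the hypothetical helper resolves treats a module name as found if a file name.lua exists under some directory on a colon-separated search path. The throwaway directory tree stands in for /opt/modules, and the Core/GCC/11.3.0.lua file inside it is an assumed example.

```shell
# Simplified model of module lookup: a name resolves if <dir>/<name>.lua exists
# in some directory on a colon-separated search path. Real Lmod does more than this.
resolves() {
    name="$1"
    search_path="$2"
    old_ifs="$IFS"
    IFS=':'
    for dir in $search_path; do
        if [ -f "$dir/$name.lua" ]; then
            IFS="$old_ifs"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Build a throwaway tree that stands in for /opt/modules (assumed layout).
root="$(mktemp -d)"
mkdir -p "$root/modules/Core/GCC"
: > "$root/modules/Core/GCC/11.3.0.lua"

user_path="$root/modules/Core"            # "Core" is already part of the search path
fixed_path="$root/modules:$user_path"     # after also adding the parent directory

resolves "GCC/11.3.0" "$user_path"       && echo "user-style name found"
resolves "Core/GCC/11.3.0" "$user_path"  || echo "EasyBuild-style name missing"
resolves "Core/GCC/11.3.0" "$fixed_path" && echo "EasyBuild-style name found after the fix"

rm -rf "$root"
```

In this model, adding the parent directory to the search path is what lets the "Core/"-prefixed name resolve, without breaking the short names users already rely on.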
Installing
Now we're ready to install. We just use the ebS alias with our EasyConfig,
and hope things go well!
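Since ebS is an alias specific to the MSU HPCC, here is a rough sketch of the same step using plain eb. The function install_with_easybuild is hypothetical, and it is only an approximation of what ebS does: all we know from above is that ebS includes --robot. The sketch first reports missing modules with --missing-modules (the long form of -M) and then installs with --robot.

```shell
# Hypothetical helper, not part of EasyBuild or the HPCC aliases.
install_with_easybuild() {
    ec="$1"
    if ! command -v eb >/dev/null 2>&1; then
        echo "eb not found: run 'module load EasyBuild' first" >&2
        return 1
    fi
    # Report which required modules are not installed yet.
    eb --missing-modules "$ec"
    # Build and install the EasyConfig, resolving missing dependencies as needed.
    eb --robot "$ec"
}

# Example call (commented out so that sourcing this snippet installs nothing):
# install_with_easybuild PostgreSQL-13.4-GCCcore-11.2.0.eb
```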