Module System Tutorial
The HPCC has a large amount of software installed in order to support its diverse users. This can include multiple versions of the same software. The module system exists to manage all of this by making software available to users and preventing version conflicts.
In this tutorial, you'll learn how to use the module system to:
- See which modules are currently loaded
- Search for available software versions
- Check requirements for particular modules
- Load modules
- Save currently loaded modules so they can be easily reloaded
For the purposes of this tutorial, we'll be trying to load version 3.6.3 of the R interpreter.
Viewing Currently Loaded Modules
Several modules are already loaded by default once you log on to a development node. These include several commonly used packages such as Python, MATLAB, and the GNU compiler.
Run module list to see all currently loaded modules. A long list of modules is loaded by default. If the output of module list is too big for your screen, it will be displayed using a program called less.
The less program will let you scroll up and down through the list with the arrow keys. You can press q to exit less at any time. You will also automatically exit once you scroll to the end of the module list output. To search within less, type the / character followed by the string you want to search for, then press enter. Use the n and N keys to jump to the next and previous matches respectively.
If we search through the output of module list, we'll see that the default version of R is 4.3.2-gfbf-2023a. This label means the module loads version 4.3.2 of R where R has been compiled with the 2023a version of the gfbf toolchain. However, recall that we want to load R version 3.6.3 for this tutorial. Let's learn more about the module system to see how to accomplish this.
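The label format described above can be split apart with ordinary shell parameter expansion. The following is a minimal sketch, assuming labels follow the name/version-toolchain layout seen here. The function name parse_module_label is hypothetical, and labels with a different layout (for example, no toolchain suffix) would need extra handling.

```shell
# Hypothetical helper: split a label such as R/4.3.2-gfbf-2023a into its parts.
# Assumes the layout name/version-toolchain; not every module follows it.
parse_module_label() {
    label=$1
    name=${label%%/*}       # text before the first "/"  -> R
    rest=${label#*/}        # text after the first "/"   -> 4.3.2-gfbf-2023a
    version=${rest%%-*}     # text before the first "-"  -> 4.3.2
    toolchain=${rest#*-}    # text after the first "-"   -> gfbf-2023a
    printf 'name=%s version=%s toolchain=%s\n' "$name" "$version" "$toolchain"
}

parse_module_label R/4.3.2-gfbf-2023a
# prints: name=R version=4.3.2 toolchain=gfbf-2023a
```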
Searching for Available Modules
In addition to searching the Available Software page in this documentation, users can search from the command line with the module command, which accepts a variety of "sub-commands". The list command we used above is one example of a sub-command.
What other sub-commands are available? Run module by itself to find out.
You'll see a long list of available sub-commands printed to the screen. Scroll up until you see the portion on listing and searching:
Listing / Searching sub-commands:
---------------------------------
list                    List loaded modules
list s1 s2 ...          List loaded modules that match the pattern
avail | av              List available modules
avail | av string       List available modules that contain "string".
spider                  List all possible modules
spider module           List all possible version of that module file
spider string           List all module that contain the "string".
spider name/version     Detailed information about that version of the
                        module.
whatis module           Print whatis information about module
keyword | key string    Search all name and whatis that contain "string".
Here we see the sub-command list, which we've already encountered. We'll cover avail and keyword in other documentation. For now, let's focus on the spider sub-command.
The spider sub-command is the most useful way to search through available modules. Its name isn't obvious, but you can think of sending a spider to walk through a tangled web of modules to find the ones matching your request.
The list of sub-commands from module shows four options for the module spider sub-command:
Argument | Output |
---|---|
None | All possible modules |
module | All versions of that module |
string | All modules containing string |
name/version | Details about a module version |
We'll cover module and string search in the next section on searching by module name. After that we'll cover the name/version search for loading a specific version.
Searching by Module Name
For this tutorial we want to search for the versions of the R interpreter. Within the module system on the HPCC, "R" with a capital R is the formal name of the module.
Run module spider R. An abbreviated output is reproduced below:
----------
R:
----------
Description:
R is a free software environment for statistical computing and graphics.
Versions:
R/3.6.3-foss-2022b
R/4.2.2-foss-2022b
R/4.3.2-gfbf-2023a
R/4.3.3-gfbf-2023b
Other possible modules matches:
ADMIXTURE AOFlagger APR APR-util Amber Armadillo Arrow Avogadro2 ...
----------
To find other possible module matches execute:
$ module -r spider '.*R.*'
----------
For detailed information about a specific "R" package (including how to load the modules) use the module's full name.
Note that names that have a trailing (E) are extensions provided by other modules.
For example:
$ module spider R/4.3.3-gfbf-2023b
----------
We see a full list of all available R versions as well as some other helpful information:
- Other possible module names we may have been searching for
- How to search for all modules containing the string "R"
- How to get detailed information on a specific version
The first two points reference searching by string in the table above, rather than searching by module. The third point references the name/version search.
What's the difference between searching by module and searching by string? As mentioned at the start of this section, "R" is the formal name of the module. The module system tries to be case insensitive, but it can have odd results.
Run module spider r with a lowercase r and you'll see we've executed a string search, returning all modules that include the letter R!
This is probably more information than you would like. Press q to quit and return to the terminal.
That said, searching by string is powerful if you don't know the precise name of the module you are looking for.
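To make the difference between the two kinds of search concrete, here is a small illustration that uses grep on a handful of module names mentioned in this tutorial. It is only an analogy under stated assumptions, not how the module system implements its search: the exact match stands in for searching by module, and the case-insensitive substring match stands in for searching by string.

```shell
# A few module names taken from this tutorial; the real list is far longer.
names='R
ADMIXTURE
APR
Amber
GCC'

# Exact, case-sensitive match on the whole name: analogous to searching by module.
printf '%s\n' "$names" | grep -x 'R'
# prints: R

# Case-insensitive substring match: analogous to searching by string.
printf '%s\n' "$names" | grep -i 'r'
# prints R, ADMIXTURE, APR and Amber, one per line (GCC contains no "r")
```

The spider output shown earlier suggests a regular-expression search of the form module -r spider '.*R.*'. A tighter pattern such as '^R$' should narrow the results to the exact name, though that particular pattern is an assumption rather than something shown in the output above.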
Details on a Module Version
All modules listed by module spider follow the name/version format. We can use this format to get more information on a specific version. As with the default R module, the version tag contains information about the toolchain with which that version of R was compiled.
We're interested in using R version 3.6.3, so run module spider R/3.6.3-foss-2022b.
You'll again see a long list of output. We are interested in the following portions:
This module can be loaded directly: module load R/3.6.3-foss-2022b
This module provides the following extensions:
base (E), compiler (E), datasets (E), graphics (E), grDevices (E), grid (E), methods (E), parallel (E), splines (E), stats (E), stats4 (E), tcltk (E), tools (E), utils (E)
The phrase "this module can be loaded directly" means that we don't have to load any additional modules before loading R version 3.6.3. Any of its dependencies - such as the FOSS 2022b toolchain it was built with - will be loaded automatically when we load this one module.
It is also worth noting that R provides a number of extensions. These are packages that are installed alongside R. In this case, these packages can be loaded within R. Some modules may also have extensions that provide additional command line tools.
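If you want to check for the "can be loaded directly" phrase in a script rather than by eye, a sketch along these lines may help. The function name loads_directly is hypothetical, and the commented usage line assumes that the module system prints its output to standard error, which is why it redirects with 2>&1.

```shell
# Hypothetical helper: read `module spider name/version` output on stdin and
# succeed only if it contains the "can be loaded directly" phrase.
loads_directly() {
    grep -q 'can be loaded directly'
}

# Sample text copied from the output shown in this tutorial.
sample='This module can be loaded directly: module load R/3.6.3-foss-2022b'
if printf '%s\n' "$sample" | loads_directly; then
    echo "no prerequisite modules needed"
fi

# On the cluster (assumes module output goes to stderr, hence 2>&1):
#   module spider R/3.6.3-foss-2022b 2>&1 | loads_directly && echo ok
```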
Loading Modules
As we saw in the last section, when checking module details, R version 3.6.3 can be loaded "directly" without first loading any dependencies. This is because any necessary dependencies will be loaded automatically, including those from its compiler toolchain.
Let's attempt to load our desired version of R:
module load R/3.6.3-foss-2022b
This results in an error! The first two sentences are the most informative part of this error:
Lmod has detected the following error: The previous module command attempted to
load "R/3.6.3-foss-2022b" while "R/4.3.2-gfbf-2023a" was already loaded. This is likely due
to loading a module with incompatible dependencies from the one currently loaded.
Since R/4.3.2-gfbf-2023a and all of its dependencies are already loaded as default modules, we'll need to clear out our currently loaded modules if we want to load a module from a different toolchain. Run module purge to remove all currently loaded modules. You can confirm all modules have been unloaded with module list.
Now, we can run module load R/3.6.3-foss-2022b again. You will not see any output after this command. Run module list if you would like to confirm it worked.
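The purge, load, and confirm steps above can be wrapped in two small shell functions. This is a sketch under stated assumptions rather than a site-provided tool: the names switch_to and is_loaded are hypothetical, and the usage lines assume that module -t list prints one name/version per line to standard error.

```shell
# Hypothetical helper: succeed if the given name/version appears in terse
# `module list` output supplied on stdin (one loaded module per line).
is_loaded() {
    grep -Fxq "$1"
}

# Hypothetical helper: clear all loaded modules, then load a single module.
# Runs in the current shell so the environment changes persist.
switch_to() {
    module purge && module load "$1"
}

# Usage on the cluster:
#   switch_to R/3.6.3-foss-2022b
#   module -t list 2>&1 | is_loaded R/3.6.3-foss-2022b && echo "R 3.6.3 is loaded"
```

Because is_loaded matches the whole line exactly, R/3.6.3-foss-2022b will not be confused with another R version that happens to be loaded.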
Saving and Restoring Loaded Modules
We often use the same pieces of software over and over on the HPCC. Remembering all the modules we need every time we log in can be a hassle.
Thankfully, the module system lets us save and restore different configurations.
Saving a Configuration
Now that we only have R version 3.6.3 and its dependencies loaded, let's save a configuration to easily access these modules later. We'll name it R-example.
Type module save R-example. You should see the following confirmation message:
Saved current collection of modules to: "R-example"
Restoring a Configuration
Log out of the HPCC and log back in to a development node. This will reset your loaded modules to the default; run module list to confirm this is the case.
Run module savelist to see our stored configurations. Confirm that you see R-example:
Named collection list :
1) R-example
Now, run module restore R-example. The existing modules will automatically be purged and your desired modules loaded!
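For scripts that should work whether or not the collection has been saved yet, one possible pattern is to try restoring first and fall back to building the collection. The sketch below uses a hypothetical function name, restore_or_create, and assumes that the module command returns a non-zero exit status when a restore fails. Check that this holds on your system before relying on it.

```shell
# Hypothetical helper: restore a saved collection, or build and save it if
# restoring fails. Assumes `module` returns a non-zero status on failure.
restore_or_create() {
    rc_collection=$1
    shift
    if module restore "$rc_collection"; then
        return 0
    fi
    # Fall back: start clean, load the requested modules, then save them.
    module purge && module load "$@" && module save "$rc_collection"
}

# Usage on the cluster:
#   restore_or_create R-example R/3.6.3-foss-2022b
```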
Further Resources
You should now understand the basics of the module system. For a refresher on searching for modules, see Searching software modules. You can also look through available modules online. Please note that searching via the command line will always yield the most up-to-date results.