Module System Tutorial

The HPCC has a large amount of software installed in order to support its diverse users. This can include multiple versions of the same software. The module system exists to manage all of this by making software available to users and preventing version conflicts.

In this tutorial, you'll learn how to use the module system to:

  • See which modules are currently loaded
  • Search for available software versions
  • Check requirements for particular modules
  • Load modules
  • Save the currently loaded modules so they are easy to reload

For the purposes of this tutorial, we'll be trying to load version 4.1.0 of the R interpreter.

Viewing Currently Loaded Modules

Several modules are loaded by default when you log on to a development node, including commonly used packages such as Python, MATLAB, and the GNU compiler.

Run module list to see all currently loaded modules. The output will look like the following:

Currently Loaded Modules:
  1) GCCcore/6.4.0     9) ScaLAPACK/2.0.2-OpenBLAS-0.2.20  17) SQLite/3.21.0
  2) binutils/2.28    10) bzip2/1.0.6                      18) GMP/6.1.2
  3) GNU/6.4.0-2.28   11) zlib/1.2.11                      19) libffi/3.2.1
  4) OpenMPI/2.1.2    12) Boost/1.67.0                     20) Python/3.6.4
  5) tbb/2018_U3      13) CMake/3.11.1                     21) Java/1.8.0_152
  6) imkl/2018.1.163  14) ncurses/6.0                      22) MATLAB/2018a
  7) OpenBLAS/0.2.20  15) libreadline/7.0                  23) powertools/1.2
  8) FFTW/3.3.7       16) Tcl/8.6.8

Unfortunately, R isn't loaded by default. We'll have to figure out what's needed to load it ourselves.
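The same check can be scripted rather than read off the list by eye. The sketch below assumes the module system exports the colon-separated LOADEDMODULES environment variable, as Lmod does; the function name is_loaded is a hypothetical helper, not part of the module command.

```shell
# Hypothetical helper: succeed if a module is currently loaded.
# Assumes the module system sets LOADEDMODULES to a colon-separated
# list such as "GCCcore/6.4.0:binutils/2.28:...".
#   is_loaded R         matches any loaded version of R
#   is_loaded R/4.1.0   matches only that exact version
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"* | *":$1/"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the default modules shown above, is_loaded Python would succeed, while is_loaded R and is_loaded GCC would both fail, because only GCCcore is loaded and the name must match in full.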

Searching for Available Modules

The module command accepts a variety of "sub-commands". The list command we used above is one such sub-command.

What other sub-commands are available? Run module by itself to find out.

You'll see a long list of available sub-commands printed to the screen. Scroll up until you see the portion on listing and searching:

Listing / Searching sub-commands:
---------------------------------
  list                              List loaded modules
  list                s1 s2 ...     List loaded modules that match the pattern
  avail | av                        List available modules
  avail | av          string        List available modules that contain "string".
  spider                            List all possible modules
  spider              module        List all possible version of that module file
  spider              string        List all module that contain the "string".
  spider              name/version  Detailed information about that version of the
                                    module.
  whatis              module        Print whatis information about module
  keyword | key       string        Search all name and whatis that contain "string".

Here we see the sub-command list, which we've already encountered. We'll cover avail and keyword in other documentation. For now, let's focus on the spider sub-command.

The spider sub-command is the most useful way to search through available modules. Its name isn't obvious, but you can think of sending a spider to walk through a tangled web of modules to find the ones matching your request.

The list of sub-commands from module shows four options for the module spider sub-command:

Argument       Output
None           All possible modules
module         All versions of that module
string         All modules containing string
name/version   Details about a module version

We'll cover module and string search in the next section on searching by module name. After that we'll cover the name/version search for loading a specific version.

Searching by Module Name

For this tutorial we want to search for the versions of the R interpreter. Within the module system on the HPCC, "R" with a capital R is the formal name of the module.

Run module spider R. An abbreviated output is reproduced below:

----------

  R:
----------
    Description:
      R is a free software environment for statistical computing and graphics.

     Versions:
        R/3.3.1
        ...
        R/4.1.0
        R/4.1.2
        R/4.2.2
     Other possible modules matches:
        ADMIXTURE  AMDuProf  APR  APR-util  Abaqus_parallel  AdapterRemoval  Advisor  ...

----------
  To find other possible module matches execute:

      $ module -r spider '.*R.*'

----------
  For detailed information about a specific "R" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider R/4.2.2
----------

We see a full list of all available R versions as well as some other helpful information:

  • Other possible module names we may have been searching for
  • How to search for all modules containing the string "R"
  • How to get detailed information on a specific version

The first two points reference searching by string in the table above, rather than searching by module. The third point references the name/version search.

What's the difference between searching by module and searching by string? As mentioned at the start of this section, "R" is the formal name of the module. The module system tries to be case insensitive, but this can produce unexpected results.

Run module spider r with a lowercase r and you'll see we've executed a string search, returning all modules that include the letter R!

This is probably more information than you would like. Press q to quit and return to the terminal.

That said, searching by string is powerful if you don't know the precise name of the module you are looking for.
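When only the version list matters, the output of module spider R can be filtered in a script. The following is a minimal sketch that assumes the output layout shown earlier in this section, with one name/version entry per line; spider_versions is a hypothetical helper name.

```shell
# Hypothetical helper: read `module spider NAME` output on stdin and
# print only the entries whose first word starts with "NAME/".
spider_versions() {
    awk -v name="$1" 'index($1, name "/") == 1 { print $1 }'
}
```

A possible invocation is module spider R 2>&1 | spider_versions R. The 2>&1 is there on the assumption that Lmod writes its output to standard error, which is its default behaviour.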

Details on a Module Version

All modules listed by module spider follow the name/version format. We can use this format to get more information on a specific version.

We're interested in using R 4.1.0, so run module spider R/4.1.0.

You'll again see a long list of output. The portion we are interested in is this:

You will need to load all module(s) on any one of the lines below before the "R/4.1.0" module is available to load.

      GCC/8.3.0  OpenMPI/3.1.4

This section tells us the required dependencies for our module. A module may have multiple possible sets of dependencies; for instance, different combinations of compilers and MPI libraries. Each different set will be listed on a new line.

For R 4.1.0, however, there is only one possible set of dependencies: version 8.3.0 of the GCC module and version 3.1.4 of the Open MPI module.
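The dependency lines can also be pulled out of the spider output automatically. This sketch assumes the layout in the excerpt above: the "You will need to load" heading, a blank line, then one dependency set per line, ended by another blank line. The name spider_prereqs is a hypothetical helper.

```shell
# Hypothetical helper: read `module spider name/version` output on
# stdin and print each dependency set on its own line.
spider_prereqs() {
    awk '
        /You will need to load/ { state = 1; next }   # heading found
        state == 0              { next }              # before the heading
        NF == 0                 { if (state == 3) exit; state = 2; next }
        state >= 2              { $1 = $1; print; state = 3 }
    '
}
```

For example, module spider R/4.1.0 2>&1 | spider_prereqs should print the single line of GCC and Open MPI modules shown above, again assuming the output arrives on standard error.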

Now that we understand the dependencies, let's move on to loading our module.

Loading Modules

As we saw in the last section when checking the module details, R 4.1.0 requires the GCC module with version 8.3.0 and the Open MPI module with version 3.1.4.

Let's check the output of module list again:

Currently Loaded Modules:
  1) GCCcore/6.4.0     9) ScaLAPACK/2.0.2-OpenBLAS-0.2.20  17) SQLite/3.21.0
  2) binutils/2.28    10) bzip2/1.0.6                      18) GMP/6.1.2
  3) GNU/6.4.0-2.28   11) zlib/1.2.11                      19) libffi/3.2.1
  4) OpenMPI/2.1.2    12) Boost/1.67.0                     20) Python/3.6.4
  5) tbb/2018_U3      13) CMake/3.11.1                     21) Java/1.8.0_152
  6) imkl/2018.1.163  14) ncurses/6.0                      22) MATLAB/2018a
  7) OpenBLAS/0.2.20  15) libreadline/7.0                  23) powertools/1.2
  8) FFTW/3.3.7       16) Tcl/8.6.8

Currently we have GCCcore/6.4.0 and OpenMPI/2.1.2 loaded. Notice GCCcore is not the same as the GCC module.

Both GCCcore and Open MPI set up compilers, and many modules depend on particular versions of them. Since several modules are already loaded by default, and these may or may not require GCCcore 6.4.0 and Open MPI 2.1.2, it's best to start with a clean module environment.

Run module purge to remove all currently loaded modules. You can confirm all modules have been unloaded with module list.

What happens if we just try to load R directly? Run module load R/4.1.0 and you should see the following error:

Lmod has detected the following error:  These module(s) or extension(s) exist
but cannot be loaded as requested: "R/4.1.0"
   Try: "module spider R/4.1.0" to see how to load the module(s).

We cannot load a module without first loading its dependencies! The error message gives us a helpful reminder about how to learn what those requirements are, but we already know that R 4.1.0 requires GCC/8.3.0 and OpenMPI/3.1.4.

Run module load GCC/8.3.0 OpenMPI/3.1.4 R/4.1.0 to load R along with its dependencies.

You will not see any output after this command. Run module list if you would like to confirm it worked.
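The purge, load, and list steps from this section can be collected into one shell function, for instance for use at the top of a job script. This is only a sketch: run, load_r_410, and the DRY_RUN variable are hypothetical names introduced here so the sequence can be previewed without changing the environment.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print each command instead of
# running it; otherwise run the command as given.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start from a clean environment, load R 4.1.0 with the dependencies
# reported by module spider, then show what ended up loaded.
load_r_410() {
    run module purge &&
    run module load GCC/8.3.0 OpenMPI/3.1.4 R/4.1.0 &&
    run module list
}
```

Setting DRY_RUN=1 before calling load_r_410 prints the three module commands in order. Without it, the function runs them and stops at the first command that fails.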

Saving and Restoring Loaded Modules

We often use the same pieces of software over and over on the HPCC. Remembering all the modules we need every time we log in can be a hassle.

Thankfully, the module system lets us save and restore different configurations.

Saving a Configuration

Now that only GCC, Open MPI, and R 4.1.0 are loaded, let's save a configuration to easily access these modules later. We'll name it R-example.

Type module save R-example. You should see the following confirmation message:

Saved current collection of modules to: "R-example"

Restoring a Configuration

Log out of the HPCC and log back in to a development node. This will reset your loaded modules to the default; run module list to confirm this is the case.

Run module savelist to see our stored configurations. Confirm that you see R-example:

Named collection list :
  1) R-example 

Now, run module restore R-example. The existing modules will automatically be purged and your desired modules loaded!
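In a script, it can be useful to confirm that a collection exists before restoring it. The sketch below assumes the module savelist layout shown above, a header line followed by numbered collection names; has_collection is a hypothetical helper name.

```shell
# Hypothetical helper: read `module savelist` output on stdin and
# succeed if any whitespace-separated word equals the given name.
has_collection() {
    awk -v name="$1" '
        { for (i = 1; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    '
}
```

One way to use it is module savelist 2>&1 | has_collection R-example && module restore R-example, which restores the collection only if it is listed. As before, the 2>&1 assumes the list is written to standard error.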

Further Resources

You should now understand the basics of the module system. For a refresher on searching for modules, see Searching software modules.