MATLAB

About MATLAB

  • There is a MATLAB portal for Michigan State University users. Users are encouraged to visit it for information and support provided by MathWorks.

  • Several versions of MATLAB are installed on the cluster. By default, MATLAB/2018a is loaded. Other available versions can be listed by typing

hpc@dev-intel18:~> module spider MATLAB

To switch to a different version, for example from 2018a to 2019a, load that version directly:

hpc@dev-intel18:~> module load MATLAB/2019a 
  • Many of MATLAB's built-in functions are multi-threaded. On your personal computer, when a multi-threaded MATLAB function is called, MATLAB automatically spawns as many threads as there are cores on the machine. To avoid overutilizing compute nodes on HPCC, users should cap the number of threads by calling maxNumCompThreads(N) in MATLAB, where N is the maximum number of threads MATLAB may use in the session. Users can also launch MATLAB with the -singleCompThread option so that the session uses only a single thread. Without either setting, a MATLAB session may spawn as many as 28 threads (on intel16 nodes) or 20 threads (on intel14 nodes) when a built-in multi-threaded function is called. To use multi-threaded functions in MATLAB, do the following:
    1. Call maxNumCompThreads(N) at the beginning of the MATLAB program, where N is the maximum number of compute threads the program may use. For example, maxNumCompThreads(4) limits the program to four threads.
    2. If submitting a batch job, specify --cpus-per-task=N in your job script, where N matches the value passed to maxNumCompThreads(N).
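
As a minimal sketch of how the two steps above fit together in a batch job, the fragment below reads the CPU count that Slurm exports as SLURM_CPUS_PER_TASK and passes it to maxNumCompThreads, so the two values cannot drift apart. The script name myMatlab, the value 4, and the fallback to one thread outside a job are illustrative assumptions, not HPCC requirements.

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4      # N: the thread cap below follows this value

# Slurm exports SLURM_CPUS_PER_TASK when --cpus-per-task is given.
# Assumption: fall back to 1 thread if the script is run outside a job.
NTHREADS="${SLURM_CPUS_PER_TASK:-1}"

# myMatlab is a placeholder for your own script (myMatlab.m).
MATLAB_CMD="maxNumCompThreads(${NTHREADS}); myMatlab; exit"

if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay -r "${MATLAB_CMD}"
else
    # Lets the sketch run on a machine without MATLAB installed.
    echo "matlab not on PATH; would run: matlab -nodisplay -r \"${MATLAB_CMD}\""
fi
```

The trailing exit is included so that MATLAB quits once the script finishes instead of waiting at its prompt.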

Warning

Starting Nov. 1, 2018, on the new HPCC system, MATLAB no longer defaults to a single compute thread. For versions 2018 and older, the matlab-mt and matlab commands behave the same. matlab-mt no longer exists in versions 2019 and later.

  • HPCC has many toolboxes installed. To see a list of installed toolboxes and licenses, type
hpc@dev-intel18:~> module load powertools
hpc@dev-intel18:~> licensecheck matlab
Checking Licenses for matlab
lmstat -a -c 27000@lm-02.i

Users of MATLAB:  (Total of 10000 licenses issued;  Total of 44 licenses in use)
Users of SIMULINK:  (Total of 10000 licenses issued;  Total of 1 license in use)
Users of Bioinformatics_Toolbox:  (Total of 10000 licenses issued;  Total of 0 licenses in use)
Users of Communication_Toolbox:  (Total of 10000 licenses issued;  Total of 0 licenses in use)

......

Running MATLAB on Scratch Space

It is strongly recommended that iCER users run programs on the scratch space. This may improve both the speed of job execution and the performance of the whole system. MATLAB users should check carefully whether any temporary files involved in program execution need to be stored in scratch space. For example, if you use the MATLAB Compiler to build a MATLAB program into a standalone program, point the environment variable MCR_CACHE_ROOT at the scratch directory with export MCR_CACHE_ROOT=$SCRATCH before starting the execution. Add this line to the job script before the line that runs your task.

This setting overrides the MATLAB Compiler Runtime default, which creates the directory for its temporary data cache in the user's home directory $HOME. Without this setting, a program running on scratch space may frequently access its cache in the home space, which greatly slows down the code and may slow down the whole system or even cause system instability.
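
The following is a small sketch of where the MCR_CACHE_ROOT setting might sit in a job script. Using a separate subdirectory per job is an assumption added here to keep concurrent jobs from sharing one cache (the text above uses $SCRATCH itself). The /tmp fallback exists only so the fragment runs outside HPCC, and myStandaloneApp is a hypothetical name for your compiled program.

```shell
#!/bin/bash
# $SCRATCH is defined on HPCC; the fallback is only for running this sketch elsewhere.
CACHE_BASE="${SCRATCH:-${TMPDIR:-/tmp}}"

# Assumption: one cache directory per job, named after the Slurm job ID
# (or the shell's process ID when run outside a job).
export MCR_CACHE_ROOT="${CACHE_BASE}/mcr_cache_${SLURM_JOB_ID:-$$}"
mkdir -p "${MCR_CACHE_ROOT}"
echo "MCR cache directory: ${MCR_CACHE_ROOT}"

# Start the compiled program only after the variable is exported.
# ./myStandaloneApp   # hypothetical compiled MATLAB program
```

The export must come before the line that launches the program, because the MATLAB Compiler Runtime reads the variable at startup.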

Running MATLAB Interactively

In this document, an interactive session is one in which a user types commands into the MATLAB Command Window.

The simplest way to run MATLAB as a graphical application is to use an OnDemand session. However, if you only wish to use the command line or use the graphics through ssh, see the instructions below.

Short Sessions (< two hours)

  • ssh to one of the dev nodes and run MATLAB:

    hpc@gateway:~> ssh dev-intel16
    Last login: Mon Dec  4 12:54:44 2017 from gateway
    ===
    This front-end node is not meant for running big or long-running jobs.  Jobs
    that need to run longer than a few minutes should be submitted to the queue.
    Long-running jobs on front-end nodes will be killed without warning.
    ===
    hpc@dev-intel16:~> matlab -nodisplay
    

    More information about running jobs interactively on compute nodes is available at Running Programs Interactively.

  • If you require graphics, please ensure that you have an X server running. For Linux and Mac users,

    ssh -X username@hpcc.msu.edu
    

    If you want graphics, but don't want the desktop, type

    hpc@dev-intel16:~> matlab -nodesktop
    

    This will let you run code from the command line instead of from the IDE interface, but will still allow you to use graphics (e.g., make plots).

Interactive MATLAB jobs running on development nodes are limited to two hours of wall time and will be killed automatically after two hours.

Long Sessions (> two hours)

Longer interactive sessions are possible, but are not recommended. Modify the following commands to suit your requirements.

If graphics are not required,

hpc@dev-intel18:~> salloc --nodes=1 --ntasks=1 --cpus-per-task=1 --mem=2gb --time=04:00:00
salloc: Granted job allocation 310982
salloc: Waiting for resource configuration
salloc: Nodes lac-376 are ready for job
hpc@dev-intel18:~> matlab -nodisplay

If graphics are required, add the option --x11 to the salloc command:

hpc@dev-intel18:~> salloc --nodes=1 --ntasks=1 --cpus-per-task=1 --mem=2gb --time=04:00:00 --x11

Warning

The above commands submit a job to the cluster. If the resources are not immediately available, you will have to wait until they are. Jobs requesting four hours or less are typically scheduled relatively quickly. Adjust the requested resources to match the needs of your MATLAB program.

Running MATLAB Non-Interactively

Short Jobs (< two hours)

A short job can be run on a development node without opening the MATLAB Command Window. From a development node, type

hpc@dev-intel16:~> matlab -nodisplay -r "myMatlab" &

myMatlab.m will start running in the background.

Long Jobs (> two hours)

Single MATLAB Job

To submit jobs to the cluster, a job script needs to be written and submitted to the queue. The following sample job script file can be modified to suit your needs:

myJob.sbatch

#!/bin/bash

### Specify the Resources Needed

#SBATCH --time=02:05:00 
#SBATCH --nodes=1 
#SBATCH --ntasks=1 
#SBATCH --cpus-per-task=1 
#SBATCH --mem-per-cpu=1G        

### Set SLURM Admin parameters

#SBATCH --job-name=myJobName
#SBATCH --output=%x-%j.SLURMout
#SBATCH --mail-type=ALL
#SBATCH --mail-user=myNetID@msu.edu

echo "This script is from ICER's MATLAB tutorial"

### Navigate to your data directory and run Matlab

cd $HOME/my                           
matlab -nodisplay -r "myMatlab"
  • --time=02:05:00 is the wall time your job needs, as hours:minutes:seconds. If the job runs longer than this, it will be killed. If you request more time than you need, your job may be delayed while the scheduler finds a time to run it. If you don't know how long your job needs, make a guess and use the actual running time to refine the request on future runs. The maximum walltime that can be requested is 168:00:00.
  • --nodes=1 because you are running one MATLAB client in the job. If you want to put multiple MATLAB runs in a single job script, you may request more nodes for the job.
  • --cpus-per-task=1 because the MATLAB script myMatlab.m does not use multiple threads. If you are using multiple threads, you may need to request more CPUs.
  • --mem-per-cpu=1G reserves 1 gigabyte of memory per CPU for the job. We recommend reserving at least 1 GB for each session plus the total size of the data variables used in the computation. MathWorks also publishes recommended system requirements for MATLAB.
  • myJobName is a string that makes your job easier to identify when managing or monitoring your jobs.
  • --output=%x-%j.SLURMout sets the SLURM output filename to myJobName-IDnumber.SLURMout.
  • myMatlab is the name of your MATLAB script without the .m extension.
  • --mail-type=ALL tells SLURM to email you on all job scheduling events, e.g., start, complete, exit-code.
  • --mail-user=myNetID@msu.edu sets the email address SLURM uses; change myNetID@msu.edu to your preferred email address.

Submit your job with:

hpc@dev-intel16:~> sbatch myJob.sbatch

myJob.sbatch is the name of your job script as shown above.

Using the MATLAB Parallel Computing Toolbox

The MATLAB Parallel Computing Toolbox (PCT) provides several parallel computing features:

  • Parallel for-loops (parfor). Run module load powertools; getexample MATLAB_parfor to download a directory "MATLAB_parfor", which contains an example of using parpool and parfor with the "local" profile.

  • Support for GPU computing

  • Offload computing from your laptop to the HPCC cluster (with MATLAB Distributed Computing Server)

  • Distributed arrays and spmd (single-program-multiple-data) for large dataset handling and data-parallel algorithms

    1. Each worker uses one GPU card. To use multiple GPUs, use spmd so that each instance of the program uses one card and multiple instances together use multiple cards.
    2. To use GPU capability, MATLAB must run on a node with a GPU. dev-intel16-k80 and dev-amd20-v100 are the development nodes with GPUs. In a job, request GPUs with --gpus=<type>:<number>, giving the type and number of GPUs. For example, #SBATCH --gpus=k80:1 requests one k80 GPU. Valid GPU types are k80 and v100. The type is optional, but the number of GPUs is required.
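
A hedged sketch of a job script that requests one GPU as described above and checks that MATLAB can see it. The time and memory values are placeholders. gpuDeviceCount is a Parallel Computing Toolbox function, and the script assumes the matlab command is already on the PATH on the compute node.

```shell
#!/bin/bash
#SBATCH --time=00:10:00        # placeholder wall time
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G       # placeholder memory request
#SBATCH --gpus=k80:1           # one k80 GPU; --gpus=1 would leave the type open

# Print how many GPUs MATLAB can see, then quit.
GPU_CHECK="disp(gpuDeviceCount); exit"

if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay -r "${GPU_CHECK}"
else
    # Lets the sketch run on a machine without MATLAB installed.
    echo "matlab not on PATH; would run: matlab -nodisplay -r \"${GPU_CHECK}\""
fi
```

If the printed count is 0, the job most likely did not land on a GPU node or the --gpus request was missing.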

Using the MATLAB Parallel Server

The MATLAB Parallel Server lets users solve computationally intensive and data-intensive problems by executing MATLAB and Simulink based applications on the HPCC cluster and clouds (see the MathWorks documentation for more information). The HPCC cluster has this product installed.

We recommend that users prototype their applications using the Parallel Computing Toolbox, and then scale up to a cluster using MATLAB Parallel Server. To scale up to the cluster, users do not need to recode the program; they only need to change the cluster profile.

Setup and validate your cluster profile

In this step you define a cluster profile to be used in subsequent steps.

  1. Start the Cluster Profile Manager from the MATLAB desktop: on the Home tab, in the Environment area, select Parallel > Create and Manage Clusters.

  2. Create a new profile in the Cluster Profile Manager by selecting Add Cluster Profile > Slurm.

  3. With the new profile selected in the list, click Rename, edit the profile name, and press Enter.

  4. Select a profile in the list and click Edit to edit the profile as needed. When you have finished editing, click Done to save the profile.

  5. Click Validate to validate the profile. The profile can be used once it passes all the validation tests.

Starting from version 2018a, MATLAB supports the SLURM scheduler. Please refer to MathWorks' documentation for how to configure the cluster with the SLURM scheduler. Previous versions of MATLAB are not recommended for use with the HPCC scheduler.

Using the MATLAB thread-based worker pool

Starting with MATLAB version R2020a, a thread-based worker pool is available. Please refer to the MathWorks documentation for more details. On HPCC, we provide an example showing how to use a thread-based pool with parfor, as well as a comparison between process-based and thread-based pools. To obtain the example, load the powertools and MATLAB/2020a modules, then run getexample MATLAB_threadPool.

Running MATLAB/2020a on AMD nodes

There is a bug in MATLAB/2020a, associated with the Java virtual machine, that leads to a "segmentation fault" on AMD nodes. A patch may be introduced in the next release. If your code works in other versions but crashes in 2020a, you can work around the problem by launching the MATLAB session without the Java virtual machine, as follows:

[hpc@eval-epyc19 ~]$ matlab -nodisplay -nojvm -r "myExample"

Running MATLAB on intel14 nodes

Some functions have a known problem on intel14 nodes. If you run into an 'Illegal instruction detected' error, please use a node of another type. Add a constraint to exclude intel14 compute nodes when submitting to SLURM. For example:

#SBATCH --constraint=[intel16|intel18|amd20]

Using MATLAB with Python

MathWorks provides cheat sheets on using MATLAB with Python for users' reference.