MATLAB
About MATLAB
- There is a MATLAB portal for Michigan State University users. Users are encouraged to visit it for information and support provided by MathWorks.
- Several versions of MATLAB are installed on the cluster. By default, MATLAB/2018a is loaded. Other available versions can be listed through the module system (for example, with module avail MATLAB). A different version can then be loaded directly by name; for example, to switch from 2018a to 2019a:
module load MATLAB/2019a
- Many of MATLAB's built-in functions are multi-threaded. On a personal computer, when such a function is called, MATLAB automatically spawns as many threads as there are cores on the machine. To avoid over-utilizing compute nodes on HPCC, users should cap the number of threads by calling
maxNumCompThreads(N)
in MATLAB, where N is the maximum number of threads MATLAB may use in the session. Users can also launch MATLAB with the option
-singleCompThread
to restrict the session to a single thread. Without these options, a MATLAB session may spawn as many as 28 (on intel16 nodes) or 20 (on intel14 nodes) threads when a built-in multi-threaded function is called. To use multi-threaded functions in MATLAB, users need to do the following:
  - Specify the maximum number of compute threads with maxNumCompThreads(N) at the beginning of the MATLAB program, where N is the maximum number of threads in the program. For example, maxNumCompThreads(4) sets the maximum number of threads used in the program to four.
  - If submitting a batch job, specify --cpus-per-task=N in the job script, where N should match the value passed to maxNumCompThreads(N).
Warning
Starting Nov. 1, 2018, on the new HPCC system, MATLAB no longer runs with a single compute thread by default. For versions 2018 and older, the matlab-mt and matlab commands are equivalent; starting with version 2019, matlab-mt no longer exists.
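The thread cap and the CPU request described above can be kept in agreement by deriving both from one variable. The following is a minimal sketch, not an official HPCC template: the file name threads_job.sbatch, the script name myMatlab, the one-hour time limit, and loading the default MATLAB module are all assumptions made for illustration.

```shell
# Number of compute threads MATLAB may use (illustrative value)
NTHREADS=4

# Write a job script whose CPU request matches the MATLAB thread cap.
# The heredoc delimiter is unquoted so that ${NTHREADS} is substituted.
cat > threads_job.sbatch <<EOF
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=${NTHREADS}
#SBATCH --time=01:00:00

# Assumption: the default MATLAB module is the one you want
module load MATLAB

# Cap MATLAB's compute threads at the number of CPUs requested above,
# then run the (hypothetical) script myMatlab.m
matlab -nodisplay -r "maxNumCompThreads(${NTHREADS}); myMatlab; exit"
EOF

# Show the two lines that must agree
grep -E -- 'cpus-per-task|maxNumCompThreads' threads_job.sbatch
```

Changing NTHREADS in one place updates both the scheduler request and the call to maxNumCompThreads, which avoids requesting fewer CPUs than MATLAB will try to use.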
- HPCC has many toolboxes installed. A list of installed toolboxes and licenses can be obtained from within a MATLAB session; the ver command, for example, prints the installed toolboxes.
Running MATLAB on Scratch Space
It is strongly recommended that iCER users run programs on scratch space. This may improve the speed of job execution as well as the performance of the whole system. MATLAB users should carefully check whether any temporary files involved in program execution need to be stored in scratch space. For example, if you use the MATLAB Compiler to turn a MATLAB program into a standalone program, point the environment variable MCR_CACHE_ROOT at your scratch directory before starting the execution:
export MCR_CACHE_ROOT=$SCRATCH
Add this line to the job script before the line that runs your task. It overrides the MATLAB Compiler Runtime default, which creates the directory for its temporary data cache in the user's home directory, $HOME. Without this setting, a program running on scratch space may frequently access its cache in home space, which greatly slows down the code and may slow down the whole system or even cause system instability.
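As a sketch of where this setting sits in a job script, the fragment below exports MCR_CACHE_ROOT ahead of the line that would launch the compiled program. The program name myStandaloneApp is hypothetical, and the fallback for an undefined $SCRATCH is an assumption added only so that the fragment also runs outside the cluster.

```shell
# Hypothetical fallback: on HPCC, $SCRATCH is expected to be defined already
: "${SCRATCH:=${TMPDIR:-/tmp}}"

# Must appear before the line that launches the compiled MATLAB program
export MCR_CACHE_ROOT="$SCRATCH"
echo "MCR cache directory: $MCR_CACHE_ROOT"

# The compiled program would be started here, for example:
# ./myStandaloneApp    # hypothetical name
```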
Running MATLAB Interactively
In this document, an interactive session is one in which a user types commands into the MATLAB Command Window.
The simplest way to run MATLAB as a graphical application is to use an OnDemand session. However, if you only wish to use the command line, or to use graphics through ssh, see the instructions below.
Short Sessions (< two hours)
- ssh to one of the dev nodes and run MATLAB:
hpc@gateway:~> ssh dev-intel16
Last login: Mon Dec 4 12:54:44 2017 from gateway
===
This front-end node is not meant for running big or long-running jobs.
Jobs that need to run longer than a few minutes should be submitted to the queue.
Long-running jobs on front-end nodes will be killed without warning.
===
hpc@dev-intel16:~> matlab -nodisplay
More information about running jobs interactively on compute nodes can be found at Running Programs Interactively.
- If you require graphics, make sure that you have an X server running. Linux and Mac users can connect with X11 forwarding:
ssh -X username@hpcc.msu.edu
If you want graphics but don't want the desktop, type
hpc@dev-intel16:~> matlab -nodesktop
This lets you run code from the command line instead of from the IDE, while still allowing graphics (e.g., plots).
Interactive MATLAB jobs running on development nodes are limited to two hours of wall time and are killed automatically after two hours.
Long Sessions (> two hours)
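Longer interactive sessions are possible but not recommended. They run on compute nodes through an interactive allocation requested with salloc; adjust the requested resources to suit your requirements.
If graphics are not required, request the allocation and start MATLAB from the command line once it is granted.
If graphics are required, add the option --x11 to the salloc command.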
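One way such a request might look is sketched below. The resource values (one CPU, 2 GB per CPU, four hours) are illustrative assumptions rather than HPCC recommendations, and the guard only keeps the fragment runnable on machines without SLURM.

```shell
# Illustrative resource request; adjust to what your MATLAB session needs
SALLOC_OPTS="--nodes=1 --cpus-per-task=1 --mem-per-cpu=2G --time=04:00:00"

if command -v salloc >/dev/null 2>&1; then
    # Word splitting of $SALLOC_OPTS is intentional.
    # Append --x11 to the options if graphics are required.
    salloc $SALLOC_OPTS
else
    echo "salloc $SALLOC_OPTS"
fi

# Once the allocation is granted, on the compute node:
#   module load MATLAB      (assumption: default MATLAB module)
#   matlab -nodisplay
```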
Warning
The above commands submit a job to the cluster. If the resources are not immediately available, you will have to wait until they are. Jobs requesting four hours or less are typically scheduled relatively quickly. Adjust the requested resources to match what your MATLAB program actually uses.
Running MATLAB Non-Interactively
Short Jobs (< two hours)
A short job can be run on a development node without opening the MATLAB Command Window. From a development node, start MATLAB from the shell as a background process with the script to run; myMatlab.m then runs in the background.
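A background launch could look like the sketch below. It is one common pattern, not necessarily the command the HPCC documentation originally showed: the use of nohup, the log file name, and the assumption that a MATLAB module is already loaded are all illustrative.

```shell
SCRIPT_NAME=myMatlab              # script name without the .m extension
LOG_FILE="${SCRIPT_NAME}.log"     # hypothetical log file name

if command -v matlab >/dev/null 2>&1; then
    # -r runs the script and then exits; "&" sends the process to the background.
    # nohup keeps it running after logout; stdin is detached with /dev/null.
    nohup matlab -nodisplay -r "${SCRIPT_NAME}; exit" < /dev/null > "$LOG_FILE" 2>&1 &
    echo "MATLAB started in the background (PID $!); output goes to $LOG_FILE"
else
    echo "matlab is not on PATH; load a MATLAB module first"
fi
```

Output that would normally appear in the Command Window ends up in the log file, which can be followed with tail -f while the script runs.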
Long Jobs (> two hours)
Single MATLAB Job
To submit jobs to the cluster, write a job script and submit it to the queue. The sample job script myJob.sbatch uses the following settings, which can be modified to suit your needs:
- --time=02:05:00 is the number of hours:minutes:seconds your job needs to run. If the job runs longer than this, it will be killed. If you request more time than you need, your job may be delayed while the scheduler finds a time to run it. If you don't know how long your job needs, make a guess and use the real running time to improve this number on future runs. The maximum walltime that can be requested is 168:00:00.
- --nodes=1 because you are running one MATLAB client in the job. If you want to put multiple MATLAB runs in a single job script, you may request more nodes for the job.
- --cpus-per-task=1 because the MATLAB script myMatlab.m does not use multiple threads. If you are using multiple threads, you may need to request more CPUs.
- --mem-per-cpu=1G reserves 1 gigabyte of memory per CPU for the job. We recommend reserving at least 1 GB for each session plus the total size of the data variables used in the computation. Users can also consult the MATLAB system requirements recommended by MathWorks.
- myJobName is a string that makes your job easier to identify when managing or monitoring your jobs.
- --output=%x-%j.SLURMout sets the SLURM output filename to myJobName-IDnumber.SLURMout.
- myMatlab is the name of your MATLAB script without the .m extension.
- --mail-type=ALL tells SLURM to email you about all job scheduling events, e.g., start, complete, and exit code.
- --mail-user=myNetID@msu.edu sets the email address SLURM uses; change myNetID@msu.edu to your preferred email address.
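Putting the settings above together, a job script along the following lines could be written. This is a sketch reconstructed from the option descriptions, not necessarily the exact HPCC sample: the --job-name option, loading the default MATLAB module, changing to $SLURM_SUBMIT_DIR, and launching with matlab -nodisplay -r are assumptions.

```shell
# Write the job script to myJob.sbatch.
# The quoted 'EOF' prevents $SLURM_SUBMIT_DIR from being expanded now;
# it is resolved later, when the job runs.
cat > myJob.sbatch <<'EOF'
#!/bin/bash
#SBATCH --time=02:05:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --job-name=myJobName
#SBATCH --output=%x-%j.SLURMout
#SBATCH --mail-type=ALL
#SBATCH --mail-user=myNetID@msu.edu

# Assumption: the default MATLAB module is the one you want
module load MATLAB

# Run from the directory the job was submitted from
cd "$SLURM_SUBMIT_DIR"

# myMatlab is the script name without the .m extension
matlab -nodisplay -r "myMatlab; exit"
EOF

echo "wrote myJob.sbatch"
```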
Submit your job with:
sbatch myJob.sbatch
where myJob.sbatch is the name of your job script as described above.
Using the MATLAB Parallel Computing Toolbox
The MATLAB Parallel Computing Toolbox (PCT) provides several parallel computing features:
- Parallel for-loops (parfor). Users can run
  module load powertools; getexample MATLAB_parfor
  to download a directory "MATLAB_parfor", which contains an example of using parpool and parfor with the "local" profile.
- Support for GPU computing
- Offloading computation from your laptop to the HPCC cluster (with MATLAB Distributed Computing Server)
- Distributed arrays and spmd (single-program-multiple-data) for large dataset handling and data-parallel algorithms
  - One GPU card is used for each worker. To use multiple GPUs, use spmd so that each instance of the program uses one card and multiple instances of the program use multiple cards.
  - To use GPU capability, MATLAB must run on a node with a GPU. dev-intel16-k80 and dev-amd20-v100 are the development nodes with GPUs. To request GPUs in a job, use --gpus=<type>:<number> to specify the type and number of GPUs. For example, #SBATCH --gpus=k80:1 requests one k80 GPU. Valid GPU types are k80 and v100. The type is optional, but the number of GPUs is required.
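To illustrate the GPU request described above, the sketch below writes a small job script that asks for one v100 card and has MATLAB report the GPU it sees through the Parallel Computing Toolbox function gpuDevice. The file name gpu_job.sbatch, the one-hour limit, and loading the default MATLAB module are assumptions.

```shell
GPU_TYPE=v100     # valid types named above: k80, v100
GPU_COUNT=1

# Unquoted heredoc delimiter so the two variables are substituted
cat > gpu_job.sbatch <<EOF
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --gpus=${GPU_TYPE}:${GPU_COUNT}
#SBATCH --time=01:00:00

# Assumption: the default MATLAB module is the one you want
module load MATLAB

# Display the properties of the GPU MATLAB selected, then exit
matlab -nodisplay -r "disp(gpuDevice); exit"
EOF

grep -- '--gpus' gpu_job.sbatch
```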
Using the MATLAB Parallel Server
The MATLAB Parallel Server lets users solve computationally intensive and data-intensive problems by executing MATLAB and Simulink based applications on the HPCC cluster and on clouds (see the document for more information). The HPCC cluster has this product installed.
We recommend that users prototype their applications with the Parallel Computing Toolbox and then scale up to a cluster with MATLAB Parallel Server. Scaling up does not require recoding the program; only the cluster profile needs to change.
Setup and validate your cluster profile
In this step you define a cluster profile to be used in subsequent steps.
- Start the Cluster Profile Manager from the MATLAB desktop: on the Home tab, in the Environment area, select Parallel > Create and Manage Clusters.
- Create a new profile in the Cluster Profile Manager by selecting Add Cluster Profile > Slurm.
- With the new profile selected in the list, click Rename, edit the profile name, and press Enter.
- Select the profile in the list and click Edit to edit it as needed. When you finish editing, click Done to save the profile.
- Click Validate to validate the profile. The profile can be used once it passes all the validation tests.
Starting with version 2018a, MATLAB supports the SLURM scheduler. Please refer to the MathWorks documentation on how to configure the cluster with the SLURM scheduler. Earlier versions of MATLAB are not recommended for use with the HPCC scheduler.
Using the MATLAB thread-based worker pool
Starting with MATLAB version R2020a, a thread-based worker pool is available. Please refer to this document for more details. On HPCC, we provide an example showing how to use a thread-based pool with parfor, along with a comparison between process-based and thread-based pools. To obtain the example, module load powertools and MATLAB/2020a, then run getexample MATLAB_threadPool.
Running MATLAB/2020a on AMD nodes
A bug in MATLAB/2020a, associated with the Java virtual machine, leads to a "segmentation fault" on AMD nodes. A patch may be introduced in the next release. If your code works in other versions but crashes in 2020a, you can try the workaround of launching the MATLAB session without the Java virtual machine, using the -nojvm startup option.
Running MATLAB on intel14 nodes
Some functions are known to fail on intel14 nodes. If you run into an 'Illegal instruction detected' error, please use a node of another type: add a constraint that excludes intel14 compute nodes when submitting to SLURM.
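One way to keep a job off intel14 nodes is to list the acceptable node types with SLURM's --constraint option, as in the sketch below. The feature names intel16 and amd20 are assumptions based on the node names mentioned in this document; check the feature names actually defined on the cluster before using them.

```shell
# Assumption: these feature names exist on the cluster; "|" means "either"
ALLOWED_NODE_TYPES="intel16|amd20"

if command -v sbatch >/dev/null 2>&1; then
    sbatch --constraint="$ALLOWED_NODE_TYPES" myJob.sbatch
else
    # Off-cluster: just show the command that would be run
    echo "sbatch --constraint=\"$ALLOWED_NODE_TYPES\" myJob.sbatch"
fi
```

The same restriction can be placed inside the job script as a line of the form #SBATCH --constraint="intel16|amd20".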
Using MATLAB with Python
Here is the link to the cheat sheets provided by MathWorks for users' reference.