
Using the Data Machine

This page acts as a reference for using some of the features of the Data Machine. For more general information on what the Data Machine can offer, please see the Data Machine overview.

Table of Data Machine resources

Node | CPUs | Memory | Local NVME storage | GPU | GPU memory
--- | --- | --- | --- | --- | ---
acm-048, acm-049, acm-070, acm-071 | 128 | 2 TB | 32 TB | None | N/A
nal-004, nal-005, nal-006, nal-007 | 128 | 512 GB | 32 TB | 4 NVIDIA A100 GPUs each, split into 7 allocatable units | 10 GB (per unit)

Each GPU node has four GPUs, each split into seven units that can be requested individually. Each unit has 10 GB of GPU memory. These units are requested in the same way as whole GPUs; see the examples below.

Running code on the Data Machine

Though the Data Machine is not a buy-in node, the same procedures are used behind the scenes to run on Data Machine nodes. Therefore, users must be added to the data-machine buy-in account to run jobs on the Data Machine. To be added to this account please submit a request using ICER's contact form.

Below are some examples of SLURM resource requests for the Data Machine.

Note

Each of the examples below contains a --nodelist resource specification to force a job to run on the Data Machine. This is unnecessary for most purposes.

Though the --account=data-machine line ensures that the user is given priority access to Data Machine nodes, SLURM may schedule the job on a non-Data Machine node elsewhere on the cluster if one with the requested resources is available sooner. If a job must run on the Data Machine, include the --nodelist line shown below. If only the requested resources matter, removing this line can allow your job to start faster.

Partial Data Machine node

#!/bin/bash
#SBATCH --account=data-machine  # Run under the data machine buy-in
#SBATCH --nodelist=acm-[048-049,070-071],nal-[004-007]  # Force job to run on data machine nodes
#SBATCH --nodes=1  # Reserve only one node
#SBATCH --time=4:00:00  # Reserve for four hours (or your desired amount of time)
#SBATCH --mem=256GB  # Set to your desired amount of memory
#SBATCH --cpus-per-task=32  # Set to your desired number of CPUs
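
As a usage sketch, a request like the one above can be saved to a file and submitted with sbatch (the file name data_machine.sb is hypothetical; any name works):

```shell
# Hypothetical script name; replace with your own file.
sbatch data_machine.sb   # submit the job under the data-machine buy-in
squeue --me              # check whether the job is PENDING or RUNNING, and on which node
```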

Full large-memory Data Machine node with no GPU

#!/bin/bash
#SBATCH --account=data-machine  # Run under the data machine buy-in
#SBATCH --nodelist=acm-[048-049,070-071],nal-[004-007]  # Force job to run on data machine nodes
#SBATCH --nodes=1  # Reserve only one node
#SBATCH --time=4:00:00  # Reserve for four hours (or your desired amount of time)
#SBATCH --mem=2TB  # Uses all memory on a large memory node
#SBATCH --cpus-per-task=128  # Uses all CPUs on a node

One GPU unit on a single node

#!/bin/bash
#SBATCH --account=data-machine  # Run under the data machine buy-in
#SBATCH --nodelist=acm-[048-049,070-071],nal-[004-007]  # Force job to run on data machine nodes
#SBATCH --nodes=1  # Reserve only one node
#SBATCH --time=4:00:00  # Reserve for four hours (or your desired amount of time)
#SBATCH --mem=256GB  # Set to your desired amount of memory
#SBATCH --cpus-per-task=32  # Set to your desired number of CPUs
#SBATCH --gpus=a100_1g.10gb  # Request one GPU unit on the reserved node
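
Once a job with a GPU unit starts, you can confirm the unit is visible from inside the job. This is a sketch, assuming the standard nvidia-smi tool is available on the node:

```shell
# Add after the #SBATCH lines in the job script.
# With one a100_1g.10gb unit requested, this should list a single
# MIG slice of an A100 rather than a full GPU.
nvidia-smi -L
```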

Two GPU units on a single node

#!/bin/bash
#SBATCH --account=data-machine  # Run under the data machine buy-in
#SBATCH --nodelist=acm-[048-049,070-071],nal-[004-007]  # Force job to run on data machine nodes
#SBATCH --nodes=1  # Reserve only one node
#SBATCH --time=4:00:00  # Reserve for four hours (or your desired amount of time)
#SBATCH --mem=256GB  # Set to your desired amount of memory
#SBATCH --cpus-per-task=32  # Set to your desired number of CPUs
#SBATCH --gpus=a100_1g.10gb:2  # Request two GPU units on the reserved node

Using the fast NVME storage

You can preload your data into local NVME storage using "burst buffers". SLURM will move the data you want to use into NVME storage before your job starts.

Requesting a node

At the moment, burst buffers work best when requesting one specific node in the Data Machine. This ensures that the time SLURM takes to move your data does not count against your reserved time.

However, be careful which node you pick. If the node is busy, SLURM will wait until it is available before assigning it to you. To see the current usage of the Data Machine nodes, run:

buyin_status --account data-machine

Example burst buffer resource specification

In this example, we'll assume that we don't need a GPU and choose acm-048.

#!/bin/bash
#SBATCH --account=data-machine  # Run under the data machine buy-in
#SBATCH --nodelist=acm-048  # Restrict to a specific data machine node
#SBATCH --nodes=1  # Reserve only one node
#SBATCH --time=4:00:00  # Reserve for four hours (or your desired amount of time)
#SBATCH --mem=256GB  # Set to your desired amount of memory
#SBATCH --cpus-per-task=128  # Set to your desired number of CPUs
#BB source=/mnt/home/<username>/important/data/here

Using the local data

SLURM sets the environment variable BB_DATA to the location of your data on the local NVME storage. Use this directory to access your data with lower latency than the home, research, or scratch space it originally came from.
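
For example, the command portion of a job script might read its input through BB_DATA. This is a minimal sketch; the fallback to the original path is an assumption for illustration:

```shell
# BB_DATA is set by SLURM when a #BB directive stages data onto local NVME.
# Fall back to the original (hypothetical) network path if BB_DATA is unset.
INPUT_DIR="${BB_DATA:-/mnt/home/<username>/important/data/here}"
echo "Reading input from: $INPUT_DIR"
```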

Saving data written to local storage

Usually, any changes you make to data on local storage are lost after the job ends. However, if you add the specification resync=true to the #BB line in your submission script, that data will be copied back to its original location after the job ends.

Example:

#!/bin/bash
#SBATCH --account=data-machine  # Run under the data machine buy-in
#SBATCH --nodelist=acm-048  # Restrict to a specific data machine node
#SBATCH --nodes=1  # Reserve only one node
#SBATCH --time=4:00:00  # Reserve for four hours (or your desired amount of time)
#SBATCH --mem=256GB  # Set to your desired amount of memory
#SBATCH --cpus-per-task=128  # Set to your desired number of CPUs
#BB source=/mnt/home/<username>/important/data/here resync=true

Debugging burst buffer issues

To check on the status of your submitted jobs, use the command

squeue --me

The NODELIST(REASON) column of the output may give information relevant to burst buffer steps, e.g.,

   JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
24411390 data-mach data_mac grosscra  PENDING       0:00   1:00:00      1 (BurstBufferResources)
24411197 data-mach data_mac grosscra  PENDING       0:00   1:00:00      1 (burst_buffer/lua: slurm_bb_data_in: )

A job with the BurstBufferResources reason is waiting for a node on which to run and begin transferring data. In the example above, the second job is in the slurm_bb_data_in step, i.e., it is transferring data to the node.

For more information or if there are problems with the burst buffer specification, use the command

scontrol show job <jobid>

For example, when scontrol show job 24411197 was run while the job above was transferring data, the output ended with:

...
BurstBuffer=#BB source=/mnt/home/grosscra/scripts
BurstBufferState=staging-in
...

Often the Comment field in the scontrol show job <jobid> output can give helpful burst buffer information.
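
To pull out just the burst buffer and comment fields from that output, the result can be filtered. A small sketch (the job ID is the one from the example above; replace it with your own from squeue --me):

```shell
# Show only the lines relevant to burst buffer staging.
scontrol show job 24411197 | grep -E 'BurstBuffer|Comment'
```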