ABySS

ABySS is a de novo, parallel, paired-end sequence assembler. It can run as an MPI job on the HPCC cluster. The latest version currently installed on the HPCC is 2.1.5, which can be loaded with

module load ABySS/2.1.5

You can optionally load other tools as needed, provided that they have been installed under the same toolchain environment as ABySS/2.1.5. For example,

module load BEDTools/2.27.1 SAMtools/1.9 BWA/0.7.17

is valid after you've loaded ABySS.
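If you are not sure whether a tool was built under the same toolchain, the module system can tell you. Below is a minimal sketch assuming the HPCC uses Lmod (module spider is an Lmod command); the module version shown is just the example from above:

module spider SAMtools/1.9   # lists the prerequisite/toolchain modules this version requires
module list                  # confirms what is currently loaded alongside ABySS/2.1.5

If the prerequisites reported by module spider match those of ABySS/2.1.5, the two modules can be loaded together.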

A sample SLURM script is below.

#!/bin/bash

#SBATCH --job-name=abyss_test  
#SBATCH --nodes=4  
#SBATCH --ntasks-per-node=2  
#SBATCH --mem-per-cpu=5G  
#SBATCH --time=1:00:00  
#SBATCH --output=%x-%j.SLURMout

echo "This script is from ICER's ABySS tutorial."

echo "$SLURM_JOB_NODELIST"

module load ABySS/2.1.5

export OMPI_MCA_mpi_warn_on_fork=0  
export OMPI_MCA_mpi_cuda_support=0

abyss-pe k=25 name=test in='/mnt/research/common-data/Bio/ABySS/test-data/reads1.fastq /mnt/research/common-data/Bio/ABySS/test-data/reads2.fastq' v=-v np=8 j=2

This script launches an MPI job with 8 processes, distributed across 4 nodes (--nodes=4) with two processes per node (--ntasks-per-node=2). Accordingly, on the abyss-pe command line we specify np=8. Regarding the parameter j, the ABySS manual states

The paired-end assembly stage is multithreaded, but must run on a single machine. The number of threads to use may be specified with the parameter j. The default value for j is the value of np.

So, rather than letting j default to the value of np, we set j=2, which is the number of tasks requested per node (in this case each task corresponds to one CPU).
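Rather than hard-coding these values, they can be derived from the allocation at run time. The following is a minimal sketch, assuming SLURM exports SLURM_NTASKS and SLURM_NTASKS_PER_NODE for this request (it does when --nodes and --ntasks-per-node are given); it is not part of the original script:

np=${SLURM_NTASKS}           # total MPI processes (4 nodes x 2 tasks = 8 here)
j=${SLURM_NTASKS_PER_NODE}   # threads for the single-node paired-end stage

abyss-pe k=25 name=test in='/mnt/research/common-data/Bio/ABySS/test-data/reads1.fastq /mnt/research/common-data/Bio/ABySS/test-data/reads2.fastq' v=-v np=$np j=$j

This way the abyss-pe parameters stay consistent if you later change the #SBATCH resource request.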

To submit the job, run

sbatch --constraint="[intel16|intel18]" <your SLURM script>

where <your SLURM script> is the file containing the script above.

While the job is running, you can inspect the SLURM output file (in this example, abyss_test-<job ID>.SLURMout), which contains the running log, including lines such as the following:

Running on 8 processors
6: Running on host lac-391
0: Running on host lac-194
2: Running on host lac-225
4: Running on host lac-287
7: Running on host lac-391
3: Running on host lac-225
1: Running on host lac-194
5: Running on host lac-287
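To check the job's status and follow the log as it is written, standard SLURM and shell tools are enough; this is a minimal sketch (the exact output file name depends on the job ID assigned at submission):

squeue -u $USER                         # find the job ID and state of your running jobs
tail -f abyss_test-<job ID>.SLURMout    # follow the log; replace <job ID> with the actual ID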