
Warning

This is a Lab Notebook, which describes how to solve a specific problem at a specific time. Please keep this in mind as you read and use the content, and pay close attention to the date, version information, and other details.

Extra

The 2.3.1 version of the AlphaFold Singularity image is currently non-functional on HPCC. As of October 2023, the image can no longer find the current libcusolver shared library file. Please use the 2.3.2 version of the image from this point forward. Please see the new documentation.

Lab Notebook --- Instructions for AlphaFold version 2.3.1, Singularity (2023-07-18) (DEPRECATED, see warnings)

These instructions are a work in progress for running AlphaFold version 2.3.1 using the Singularity container found at:

/opt/software/alphafold/2.3.1/alphafold_2.3.1_latest.sif

As with other containers in the /opt/software/alphafold/ directory, AlphaFold 2.3.1 can be run via Singularity.

However, AlphaFold versions after 2.3.0 use a database that is formatted differently from previous versions. This database is located in /mnt/research/common-data/alphafold/database_230:

database_230
├── bfd -> ../database/bfd/
├── mgnify
│   ├── mgy_clusters_2022_05.fa
│   └── mgy_clusters.fa -> mgy_clusters_2022_05.fa
├── params
│   ├── LICENSE
│   ├── params_model_1_multimer_v3.npz
│   ├── params_model_1.npz
│   ├── params_model_1_ptm.npz
│   ├── params_model_2_multimer_v3.npz
│   ├── params_model_2.npz
│   ├── params_model_2_ptm.npz
│   ├── params_model_3_multimer_v3.npz
│   ├── params_model_3.npz
│   ├── params_model_3_ptm.npz
│   ├── params_model_4_multimer_v3.npz
│   ├── params_model_4.npz
│   ├── params_model_4_ptm.npz
│   ├── params_model_5_multimer_v3.npz
│   ├── params_model_5.npz
│   └── params_model_5_ptm.npz
├── pdb70
│   ├── md5sum
│   ├── pdb70_a3m.ffdata
│   ├── pdb70_a3m.ffindex
│   ├── pdb70_clu.tsv
│   ├── pdb70_cs219.ffdata
│   ├── pdb70_cs219.ffindex
│   ├── pdb70_hhm.ffdata
│   ├── pdb70_hhm.ffindex
│   └── pdb_filter.dat
├── pdb_mmcif
│   ├── mmcif_files
│   │   ├── 100d.cif
│   │   ├── 101d.cif
│   │   ├── 101m.cif
│   │   ├── 102d.cif
│   │   ├── 102l.cif
│   │   └── ...
│   └── obsolete.dat
├── pdb_seqres
│   └── pdb_seqres.txt
├── small_bfd -> ../database/small_bfd/
├── uniprot
│   └── uniprot.fasta
├── uniref30
│   ├── UniRef30_2021_03_a3m.ffdata
│   ├── UniRef30_2021_03_a3m.ffindex
│   ├── UniRef30_2021_03_cs219.ffdata
│   ├── UniRef30_2021_03_cs219.ffindex
│   ├── UniRef30_2021_03_hhm.ffdata
│   ├── UniRef30_2021_03_hhm.ffindex
│   ├── UniRef30_2021_03.md5sums
│   └── UniRef30_2021_03.tar.1.gz
└── uniref90
    └── uniref90.fasta
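
Before submitting a job, it can help to confirm that the database directory contains what the listing above shows. The following is a minimal sketch, not part of the HPCC setup: the function name check_alphafold_db is hypothetical, and the handful of entries it tests is an arbitrary selection from the listing.

```shell
# Hypothetical helper: check that a few entries from the listing exist
# under the given database root. Returns non-zero if any are missing.
check_alphafold_db() {
    local root="$1"
    local missing=0
    local entry
    for entry in \
        mgnify/mgy_clusters.fa \
        params/params_model_1.npz \
        pdb70/pdb70_a3m.ffdata \
        pdb_mmcif/obsolete.dat \
        pdb_seqres/pdb_seqres.txt \
        uniprot/uniprot.fasta \
        uniref30/UniRef30_2021_03_a3m.ffdata \
        uniref90/uniref90.fasta
    do
        # -e follows symlinks, so a dangling link is reported as missing
        if [ ! -e "${root}/${entry}" ]; then
            echo "missing: ${root}/${entry}" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For example, `check_alphafold_db /mnt/research/common-data/alphafold/database_230 && echo "database looks complete"` prints the confirmation only when every tested entry is present.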

Before running AlphaFold 2.3.1, you need to set the following environment variables:

export ALPHAFOLD_DATA_PATH="/mnt/research/common-data/alphafold/database_230"
export ALPHAFOLD_MODELS="/mnt/research/common-data/alphafold/database_230/params"

We also recommend disabling local Python libraries by setting the following environment variable, which avoids conflicts with the Python installation inside the Singularity container:

export PYTHONNOUSERSITE=1

Both the database and Python library settings are applied automatically if you load the module "alphafold/2.3.1".
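
Whether the variables are exported by hand or by the module, a quick check before launching anything can catch a missing setting early. This is a minimal sketch; the function name check_alphafold_env is hypothetical, and it only tests that the three variables named above are exported and non-empty.

```shell
# Hypothetical helper: report whether the variables described above are
# exported in the current shell. Returns non-zero if any is unset or empty.
check_alphafold_env() {
    local status=0
    local name value
    for name in ALPHAFOLD_DATA_PATH ALPHAFOLD_MODELS PYTHONNOUSERSITE; do
        # printenv only sees exported variables; "|| true" keeps the
        # function going when a variable is absent
        value="$(printenv "$name" || true)"
        if [ -z "$value" ]; then
            echo "${name} is not set" >&2
            status=1
        else
            echo "${name}=${value}"
        fi
    done
    return "$status"
}
```

A typical use would be to run `module load alphafold/2.3.1` and then `check_alphafold_env`, which prints each variable and its value.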

To run AlphaFold 2.3.1 via SLURM, please use the following template (for more information about options and flags, please refer to the README on GitHub).

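In the script, input.fasta is your input file and output_dir is the output directory, which you need to set. Both are placeholders; since the container starts in /app/alphafold (see --pwd), absolute paths are the safer choice. The command /usr/bin/hhsearch inside the container fails on intel14 nodes with an "Illegal instruction" error, so please use the SBATCH option --constraint in the job script to keep the job off those nodes.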

#!/bin/bash
#SBATCH --job-name alphafold-run
#SBATCH --time=08:00:00
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=20G
#SBATCH --constraint="[intel16|intel18|amd20]"


echo "Export AlphaFold variables"

export ALPHAFOLD_DATA_PATH="/mnt/research/common-data/alphafold/database_230"
export ALPHAFOLD_MODELS="/mnt/research/common-data/alphafold/database_230/params"
export PYTHONNOUSERSITE=1
export CUDA_VISIBLE_DEVICES=0
export NVIDIA_VISIBLE_DEVICES="${CUDA_VISIBLE_DEVICES}"

echo "Start Singularity run"
# Update --max_template_date if need be.
# The old database directory is bound as well because bfd and small_bfd in
# database_230 are symlinks into it; the default below is used only if
# ALPHAFOLD_DATA_OLD_PATH is not already set.
# Monomer presets take --pdb70_database_path. For multimer presets, replace it with
# --pdb_seqres_database_path=/data/pdb_seqres/pdb_seqres.txt and
# --uniprot_database_path=/data/uniprot/uniprot.fasta.
singularity run --nv \
-B "${ALPHAFOLD_DATA_OLD_PATH:-/mnt/research/common-data/alphafold/database}" \
-B "$ALPHAFOLD_DATA_PATH":/data \
-B "$ALPHAFOLD_MODELS" \
-B .:/etc \
--pwd /app/alphafold /opt/software/alphafold/2.3.1/alphafold_2.3.1_latest.sif \
--data_dir=/data \
--output_dir=output_dir \
--fasta_paths=input.fasta \
--uniref90_database_path=/data/uniref90/uniref90.fasta \
--uniref30_database_path=/data/uniref30/UniRef30_2021_03 \
--mgnify_database_path=/data/mgnify/mgy_clusters.fa \
--bfd_database_path=/mnt/research/common-data/alphafold/database/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
--obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
--pdb70_database_path=/data/pdb70/pdb70 \
--max_template_date=2020-05-14 \
--model_preset=monomer \
--use_gpu_relax=true
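
Once the job finishes, AlphaFold normally writes its results to a subdirectory of the output directory named after the FASTA file (without its extension), with the top-ranked structure saved as ranked_0.pdb. The sketch below looks for that file under this assumption; the function name alphafold_best_model is hypothetical.

```shell
# Hypothetical helper: print the path of the top-ranked model for a given
# output directory and FASTA file, or fail if it has not been written.
alphafold_best_model() {
    local output_dir="$1"
    local fasta="$2"
    local name model
    name="$(basename "$fasta")"
    name="${name%.*}"    # drop the final extension, e.g. input.fasta -> input
    model="${output_dir}/${name}/ranked_0.pdb"
    if [ -f "$model" ]; then
        echo "$model"
    else
        echo "no ranked_0.pdb under ${output_dir}/${name}" >&2
        return 1
    fi
}
```

For example, `alphafold_best_model output_dir input.fasta` prints the model path if the run completed, and returns a non-zero status otherwise, which makes it usable in a follow-up script.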