Alphafold via Singularity

Alphafold can be run via Singularity.

The Alphafold database is located in /mnt/research/common-data/alphafold/database and has the following layout:

database
├── bfd
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
│   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
├── mgnify
│   └── mgy_clusters_2018_12.fa
├── params
│   ├── params_model_1.npz
│   ├── params_model_1_ptm.npz
│   ├── params_model_2.npz
│   ├── params_model_2_ptm.npz
│   ├── params_model_3.npz
│   ├── params_model_3_ptm.npz
│   ├── params_model_4.npz
│   ├── params_model_4_ptm.npz
│   ├── params_model_5.npz
│   └── params_model_5_ptm.npz
├── pdb70
│   ├── md5sum
│   ├── pdb70_a3m.ffdata
│   ├── pdb70_a3m.ffindex
│   ├── pdb70_clu.tsv
│   ├── pdb70_cs219.ffdata
│   ├── pdb70_cs219.ffindex
│   ├── pdb70_hhm.ffdata
│   ├── pdb70_hhm.ffindex
│   └── pdb_filter.dat
├── pdb_mmcif
│   ├── mmcif_files
│   └── obsolete.dat
├── small_bfd
│   └── bfd-first_non_consensus_sequences.fasta
├── uniclust30
│   └── uniclust30_2018_08
└── uniref90
    ├── uniref90.fasta
    └── uniref90.fasta.1.gz

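Before a long job, it can help to confirm that the database entries the job script refers to are actually present. The sketch below checks a subset of the paths from the tree above. The function name check_alphafold_db is hypothetical and not part of Alphafold, and the list of entries is an assumption based only on the layout shown here.

```shell
# check_alphafold_db is a hypothetical helper (not part of Alphafold).
# It reports which expected entries are missing under the given root directory
# and returns non-zero if any are missing.
check_alphafold_db() {
    root="$1"
    missing=0
    for entry in \
        bfd \
        mgnify/mgy_clusters_2018_12.fa \
        params \
        pdb70 \
        pdb_mmcif/mmcif_files \
        pdb_mmcif/obsolete.dat \
        uniclust30/uniclust30_2018_08 \
        uniref90/uniref90.fasta
    do
        if [ ! -e "$root/$entry" ]; then
            echo "missing: $root/$entry"
            missing=$((missing + 1))
        fi
    done
    echo "$missing missing"
    [ "$missing" -eq 0 ]
}

# Example call against the shared database location:
# check_alphafold_db /mnt/research/common-data/alphafold/database
```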
Before running Alphafold, set the following environment variables:

export ALPHAFOLD_DATA_PATH="/mnt/research/common-data/alphafold/database"  
export ALPHAFOLD_MODELS="/mnt/research/common-data/alphafold/database/params"
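A job that starts with an unset or mistyped path can fail only after waiting in the queue, so a quick check may be worthwhile. The following is a minimal sketch, assuming a hypothetical helper named require_dir; it is not part of Alphafold or Singularity.

```shell
# require_dir is a hypothetical helper: it fails if the given value is empty
# or does not point to an existing directory.
require_dir() {
    name="$1"
    path="$2"
    if [ -z "$path" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    if [ ! -d "$path" ]; then
        echo "$name points to a missing directory: $path" >&2
        return 1
    fi
    echo "$name OK: $path"
}

# Example use after the two export lines:
# require_dir ALPHAFOLD_DATA_PATH "$ALPHAFOLD_DATA_PATH" || exit 1
# require_dir ALPHAFOLD_MODELS "$ALPHAFOLD_MODELS" || exit 1
```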

To run Alphafold, use the following template. For more information about options and flags, refer to https://github.com/deepmind/alphafold.

In the script, input.fasta is your input data, and output_dir must be set to your own output directory. The command /usr/bin/hhsearch inside the container fails on intel14 nodes with an "Illegal instruction" error, so use the SBATCH option --constraint in the job script to keep the job off those nodes.

#!/bin/bash
#SBATCH --job-name alphafold-run
#SBATCH --time=08:00:00
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=20G
#SBATCH --constraint="[intel16|intel18|amd20]"

export ALPHAFOLD_DATA_PATH="/mnt/research/common-data/alphafold/database"
export ALPHAFOLD_MODELS="/mnt/research/common-data/alphafold/database/params"

# Run Alphafold inside the container: --nv enables NVIDIA GPU support,
# and each -B binds a host path into the container.
singularity run --nv \
    -B "$ALPHAFOLD_DATA_PATH":/data \
    -B "$ALPHAFOLD_MODELS" \
    -B .:/etc \
    --pwd /app/alphafold /opt/software/alphafold/2.0.0/alphafold.sif \
    --data_dir=/data \
    --output_dir=/mnt/gs18/scratch/users/my_id/alphafold/output \
    --fasta_paths=/mnt/gs18/scratch/users/my_id/alphafold/input.fasta \
    --uniref90_database_path=/data/uniref90/uniref90.fasta \
    --mgnify_database_path=/data/mgnify/mgy_clusters_2018_12.fa \
    --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --uniclust30_database_path=/data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
    --pdb70_database_path=/data/pdb70/pdb70 \
    --template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
    --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
    --max_template_date=2020-05-14 \
    --model_names=model_1 \
    --preset=casp14
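
Before submitting the template above, it may be useful to check the input file and create the output directory. The sketch below is one way to do that. The helper name prepare_run and the job script file name alphafold.sb are assumptions for illustration; the paths in the example are the placeholders used in the template.

```shell
# prepare_run is a hypothetical helper: it checks that the FASTA file exists,
# is non-empty, and starts with a header line, then creates the output directory.
prepare_run() {
    fasta="$1"
    outdir="$2"
    if [ ! -s "$fasta" ]; then
        echo "input FASTA is missing or empty: $fasta" >&2
        return 1
    fi
    # A FASTA record starts with a header line beginning with '>'.
    if ! head -n 1 "$fasta" | grep -q '^>'; then
        echo "first line of $fasta is not a FASTA header" >&2
        return 1
    fi
    mkdir -p "$outdir"
    echo "ready: $fasta -> $outdir"
}

# Example, using the placeholder paths from the template:
# prepare_run /mnt/gs18/scratch/users/my_id/alphafold/input.fasta /mnt/gs18/scratch/users/my_id/alphafold/output
# sbatch alphafold.sb   # alphafold.sb is an assumed file name for the saved template
```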