AlphaFold via Singularity

Note

These instructions are for AlphaFold Singularity containers, version 2.2.2 and earlier. If you are using an AlphaFold 2.3.0 or later container, please see the updated instructions at https://docs.icer.msu.edu/2023-07-18_LabNotebook_AlphaFold2.3.1_WorkInProgress/

AlphaFold can be run via Singularity.

The AlphaFold database is located in /mnt/research/common-data/alphafold/database and has the following layout:

database
├── bfd
│   ├── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata
│   └── bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
├── mgnify
│   └── mgy_clusters_2018_12.fa
├── params
│   ├── params_model_1.npz
│   ├── params_model_1_ptm.npz
│   ├── params_model_2.npz
│   ├── params_model_2_ptm.npz
│   ├── params_model_3.npz
│   ├── params_model_3_ptm.npz
│   ├── params_model_4.npz
│   ├── params_model_4_ptm.npz
│   ├── params_model_5.npz
│   └── params_model_5_ptm.npz
├── pdb70
│   ├── md5sum
│   ├── pdb70_a3m.ffdata
│   ├── pdb70_a3m.ffindex
│   ├── pdb70_clu.tsv
│   ├── pdb70_cs219.ffdata
│   ├── pdb70_cs219.ffindex
│   ├── pdb70_hhm.ffdata
│   ├── pdb70_hhm.ffindex
│   └── pdb_filter.dat
├── pdb_mmcif
│   ├── mmcif_files
│   └── obsolete.dat
├── small_bfd
│   └── bfd-first_non_consensus_sequences.fasta
├── uniclust30
│   └── uniclust30_2018_08
└── uniref90
    ├── uniref90.fasta
    └── uniref90.fasta.1.gz

Before running AlphaFold, set the following environment variables:

export ALPHAFOLD_DATA_PATH="/mnt/research/common-data/alphafold/database"  
export ALPHAFOLD_MODELS="/mnt/research/common-data/alphafold/database/params"
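
Since a mistyped path would only surface once the job is already running, it can help to confirm that both directories are visible from the node you are on. The following is a minimal sketch, not part of the official instructions; the helper name check_dirs is hypothetical.

```shell
# Hypothetical helper (not part of AlphaFold or the container):
# report whether each given directory exists and is readable.
check_dirs() {
    check_dirs_status=0
    for check_dirs_dir in "$@"; do
        if [ -d "$check_dirs_dir" ] && [ -r "$check_dirs_dir" ]; then
            echo "ok: $check_dirs_dir"
        else
            echo "missing or unreadable: $check_dirs_dir" >&2
            check_dirs_status=1
        fi
    done
    return "$check_dirs_status"
}

# Example call, after the two exports above:
# check_dirs "$ALPHAFOLD_DATA_PATH" "$ALPHAFOLD_MODELS"
```

The function returns a non-zero status if any directory is missing, so it can be used as a guard before submitting a job.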

To run AlphaFold, use the following template (for more information about options and flags, refer to the README on GitHub).

In the script, input.fasta is your input data, and you need to set --output_dir to your own output directory; the paths containing my_id are placeholders. Because the command /usr/bin/hhsearch inside the container fails on intel14 nodes with an "Illegal instruction" error, use the SBATCH option --constraint in the job script to keep the job off those nodes.

#!/bin/bash
#SBATCH --job-name alphafold-run
#SBATCH --time=08:00:00
#SBATCH --gpus=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=20G
#SBATCH --constraint="[intel16|intel18|amd20]"

export ALPHAFOLD_DATA_PATH="/mnt/research/common-data/alphafold/database"
export ALPHAFOLD_MODELS="/mnt/research/common-data/alphafold/database/params"

# Bind the database directory to /data and the model parameters at their host path,
# then run AlphaFold inside the container.
singularity run --nv \
    -B "$ALPHAFOLD_DATA_PATH":/data \
    -B "$ALPHAFOLD_MODELS" \
    -B .:/etc \
    --pwd /app/alphafold /opt/software/alphafold/2.0.0/alphafold.sif \
    --data_dir=/data \
    --output_dir=/mnt/gs18/scratch/users/my_id/alphafold/output \
    --fasta_paths=/mnt/gs18/scratch/users/my_id/alphafold/input.fasta \
    --uniref90_database_path=/data/uniref90/uniref90.fasta \
    --mgnify_database_path=/data/mgnify/mgy_clusters_2018_12.fa \
    --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --uniclust30_database_path=/data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
    --pdb70_database_path=/data/pdb70/pdb70 \
    --template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
    --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
    --max_template_date=2020-05-14 \
    --model_names=model_1 \
    --preset=casp14
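
Before submitting the script above, you may want a quick sanity check on the file passed to --fasta_paths. The following is a rough sketch only; the helper name count_fasta_records is hypothetical, and it simply counts header lines (lines starting with ">") rather than validating the sequences themselves.

```shell
# Hypothetical helper (not part of AlphaFold or the container):
# print the number of FASTA records, i.e. lines that start with ">".
count_fasta_records() {
    if [ ! -r "$1" ]; then
        echo "cannot read FASTA file: $1" >&2
        return 1
    fi
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the function's status at 0 in that case.
    grep -c '^>' "$1" || true
}

# Example call, using the placeholder path from the job script:
# count_fasta_records /mnt/gs18/scratch/users/my_id/alphafold/input.fasta
```

A count of 0 suggests the file is empty or not in FASTA format, which is cheaper to discover here than after the job has waited in the queue.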