This is a Lab Notebook, which describes how to solve a specific problem at a specific time.
Please keep this in mind as you read and use the content. Please pay close attention to the date, version
information, and other details.
Lab Notebook --- Instructions for AlphaFold version 2.3.2, Singularity (2023-11-10)(WORK IN PROGRESS)
Currently, due to certain system limitations, AlphaFold version 2.3.0 and later cannot be installed on HPCC or run as Docker images.
Therefore, we are working to make Singularity images created by a third-party user available on HPCC (see https://github.com/prehensilecode/alphafold_singularity). As of the writing of this Lab Notebook, the most current image is 2.3.2-1.
These instructions are a work in progress for running AlphaFold version 2.3.2 using the singularity container found at:
As with other containers in the /opt/software/alphafold/ directory, AlphaFold 2.3.2 can be run via Singularity.
However, AlphaFold versions after 2.3.0 use a database that is formatted differently from previous versions. This database is located in
/mnt/research/common-data/alphafold/database_230.
The alphafold/2.3.2 module will set the ALPHAFOLD_DIR, ALPHAFOLD_DATADIR, and ALPHAFOLD_MODELS environment variables for you.
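Because everything downstream depends on these variables, it can be worth confirming that they are actually present in the environment before starting a long run. The following is a minimal sketch, not part of the alphafold module: the function name check_alphafold_env is hypothetical, and the only assumption is that the module exports the three variables named above.

```shell
# Hypothetical helper (not provided by the alphafold module): verify that the
# three variables the module is expected to export are present and non-empty.
check_alphafold_env() {
    local missing=0 name value
    for name in ALPHAFOLD_DIR ALPHAFOLD_DATADIR ALPHAFOLD_MODELS; do
        # printenv only sees exported variables, which is what a child
        # process such as run_singularity.py would see as well.
        value=$(printenv "$name") || value=""
        if [ -z "$value" ]; then
            echo "ERROR: $name is not set; was the alphafold/2.3.2 module loaded?" >&2
            missing=1
        else
            echo "INFO: $name=$value"
        fi
    done
    return "$missing"
}
```

One way to use it is to call `check_alphafold_env || exit 1` immediately after the module load lines of a job script, so the job stops early instead of failing partway through.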
We recommend you use the Python script "run_singularity.py" (which is also in /opt/software/alphafold/2.3.2/) to work with the Singularity image. This script automates many of the more challenging parts of using the image, such as correctly binding paths to your data directories and enabling GPU support. Below is an example of how to run this script:
# Set the output directory as an environment variable, as this is what the script expects
export output_dir=<some_output_folder>

# Flags used below (comments cannot follow a line-continuation backslash):
#   --use_gpu            use the GPU, which makes the neural network calculations faster
#   --output_dir         where to put the results
#   --data_dir           where the AlphaFold data (such as PDB sequences) live
#   --fasta_paths        the input FASTA sequence
#   --max_template_date  the latest date considered when looking for PDB templates
#   --model_preset       "monomer" for a monomeric protein; change to "multimer" for a multimer
#   --db_preset          "reduced_dbs" uses the reduced database
python3 ${ALPHAFOLD_DIR}/run_singularity.py \
    --use_gpu \
    --output_dir=$output_dir \
    --data_dir=${ALPHAFOLD_DATADIR} \
    --fasta_paths=input.fasta \
    --max_template_date=2020-05-14 \
    --model_preset=monomer \
    --db_preset=reduced_dbs
Additionally, the --output_dir argument must be passed EXACTLY as above, using an environment variable, because the script expects an environment variable (i.e. the script uses os.environ to fill in the output directory).
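The distinction matters because Python's os.environ only contains variables that were exported; a plain shell assignment is invisible to child processes. The sketch below illustrates the difference using a small hypothetical helper, child_sees, and a placeholder path; it does not call run_singularity.py itself.

```shell
# Hypothetical helper: report what a child process sees for a variable name.
child_sees() {
    sh -c 'printenv "$1" || echo "<not set>"' sh "$1"
}

unset output_dir
output_dir=/tmp/af_demo     # plain shell variable: NOT inherited by child processes
child_sees output_dir       # prints "<not set>"

export output_dir           # now part of the environment
child_sees output_dir       # prints "/tmp/af_demo"
```

This is why the examples in this notebook use `export output_dir=...` rather than a bare assignment.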
If you would like to submit an AlphaFold job to SLURM, we have included an example script below. Note that you will need to adjust the resource requests (mainly time) depending on the complexity of your protein.
#!/bin/bash
#SBATCH --job-name 2023alphafold
#SBATCH --time=04:00:00
#SBATCH --gres=gpu:1
## We want the good GPUs
#SBATCH -C [nvf|nal|nif]
#SBATCH --cpus-per-task=8
#SBATCH --mem=12G
#SBATCH -o 2023.log

module load GCC/6.4.0-2.28 OpenMPI/2.1.2 Python/3.6.4
module load alphafold/2.3.2

# These variables are now set by the module
echo "Export AlphaFold variables"
echo INFO: ALPHAFOLD_DIR=$ALPHAFOLD_DIR
echo INFO: ALPHAFOLD_DATADIR=$ALPHAFOLD_DATADIR

# You can change this to whatever path you like
export output_dir=$SLURM_SUBMIT_DIR/2023
cd $SLURM_SUBMIT_DIR

mkdir -p $output_dir
timestamp=$(date)
echo "Starting AlphaFold at $timestamp"

python3 ${ALPHAFOLD_DIR}/run_singularity.py \
    --use_gpu \
    --output_dir=$output_dir \
    --data_dir=${ALPHAFOLD_DATADIR} \
    --fasta_paths=8IBQ.fasta \
    --max_template_date=2023-08-01 \
    --model_preset=monomer \
    --db_preset=reduced_dbs

echo INFO: AlphaFold returned $?
timestamp=$(date)
echo "Finishing AlphaFold at $timestamp"
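Since the --model_preset value has to match the input, a quick look at the FASTA file before submitting can save a wasted job. The following is a rough heuristic, not something provided by AlphaFold or by run_singularity.py: the function name suggest_model_preset is hypothetical, and it assumes that a file with one record is a monomer and a file with several records is a multimer.

```shell
# Hypothetical helper: suggest a --model_preset value from the number of
# FASTA records (header lines starting with ">") in the given file.
suggest_model_preset() {
    local n
    # grep -c exits non-zero when there are no matches or the file is missing
    n=$(grep -c '^>' "$1" 2>/dev/null) || true
    case "$n" in
        ''|0) echo "ERROR: no FASTA records found in $1" >&2; return 1 ;;
        1)    echo monomer ;;
        *)    echo multimer ;;
    esac
}

# Example (assumes 8IBQ.fasta is in the current directory):
#   preset=$(suggest_model_preset 8IBQ.fasta) && echo "--model_preset=$preset"
```

Treat the result as a prompt to double-check the input rather than as a definitive answer.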
Running the singularity image manually
If for whatever reason you need to manually run AlphaFold from the singularity image, we recommend you still run "run_singularity.py" first as this script will print the "singularity run ..." command it generates. It will be much easier to work from this command, which should have all of the bind paths properly set for the image, than to try to write your own command from scratch.
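To give a sense of the general shape of such a command, the sketch below assembles (but does not execute) an illustrative "singularity run" line. The --nv flag (expose the host NVIDIA GPUs) and the -B flag (bind a host path into the container) are standard Singularity options, but the image path, the host paths, and the container-side mount points /data and /output are placeholders chosen for illustration. They are not the values used by run_singularity.py, whose printed command remains the authoritative one.

```shell
# Hypothetical helper: print an illustrative "singularity run" command.
# Arguments: image file, host data directory, host output directory,
# followed by any arguments to pass through to the container.
build_singularity_cmd() {
    local image=$1 data_dir=$2 out_dir=$3
    shift 3
    # /data and /output are placeholder mount points, not the real ones
    echo singularity run --nv \
        -B "$data_dir:/data" \
        -B "$out_dir:/output" \
        "$image" "$@"
}

# Placeholder paths only; nothing is executed here, the command is just printed.
build_singularity_cmd /path/to/alphafold.sif /path/to/database /path/to/output \
    --fasta_paths=input.fasta
```

Printing the command first, as the script itself does, makes it easy to inspect the bind paths before running anything.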
Additional Resources and Acknowledgement
I would like to thank Dr. Josh Vermaas for helping me troubleshoot this new image and providing the example SLURM script.
For additional details about the Singularity image and the run_singularity.py script, please see the GitHub of the original author: