Using Conda
Attention
This wiki serves as a very limited introduction. ICER strongly
recommends reading the conda user guide specific to the version
of interest before you start using conda on the HPCC.
Introduction
Conda is an open-source package management system that installs and updates packages and their dependencies. Conda also easily creates, saves, loads, and switches between environments on the HPCC. It was created for Python programs, but it can package and distribute software for any language as a collection of 1,000+ open-source packages with free community support. The conda package and environment manager is included in all versions of Anaconda.
Anaconda was built to complement the rich, open-source Python community. The Anaconda platform provides an enterprise-ready data analytics platform that empowers researchers to adopt a modern, open data science analytics architecture.
Conda on the HPCC
Using Conda on the HPCC requires two components: a default component installed and maintained by the HPCC administrators via the software module system, and a user-installed and user-maintained component. This dual-component configuration is required to ensure system-wide compliance for users' customized Conda environments. Hence, users must install Anaconda in their home or research space to have full control of their Conda-managed environments and packages, and must also load the HPCC-administered Conda/3 module to allow those environments and packages to run smoothly on the HPCC.
Installing Anaconda for Users
Anaconda Installation Script
To install the user-managed component of Anaconda on the HPCC, visit www.anaconda.com/download/#linux and follow the instructions below:
1. Find the link for the 64-bit (x86) Installer for Linux and copy the URL associated with this link (right-click on the download link and select "copy link address" or "copy link location"). For example, https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh.
2. Log in to the HPCC, log in to a development node, navigate to the desired installation directory if it is somewhere other than your home space, then run the command
    curl -O <copied link address>
where <copied link address> is to be replaced with the URL obtained in step 1.
3. Once the Anaconda file is downloaded, run the command
    bash <AnacondaFileName.sh>
where <AnacondaFileName.sh> is replaced with the name of the Anaconda file downloaded with the curl command. For example, Anaconda3-2022.05-Linux-x86_64.sh.
During installation you will need to:
- Accept the license terms; the output will display
    Do you accept the license terms [yes|no]?
Type yes to accept; you must agree in order to install Anaconda.
- Choose the installation location; the output will display
    Anaconda3 will now be installed into this location: $HOME/anaconda3
    - Press ENTER to confirm the location
    - Press CTRL-C to abort the installation
    - Or specify a different location below
(note: it may take a long while to complete this step)
- Choose whether to initialize conda; the output will display
    Do you wish the installer to initialize Anaconda3
    by running conda init? [yes|no]
    [no] >>>
Please type no; a more careful initialization is described after the installation instructions.
Note
Please remember the directory where Anaconda was installed. This installation path will be used next.
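As a rough sketch of the download and install steps above, the snippet below derives the installer file name from the copied URL, so the same value can be passed to both curl and bash. The URL is the example link from this page; substitute the one you copied. The two commands that do the real work are left commented out so that pasting the sketch does not start a large download by accident.

```shell
# Assumption: this is the example URL from this page; replace it with
# the link you copied from the Anaconda download page.
url="https://repo.anaconda.com/archive/Anaconda3-2022.05-Linux-x86_64.sh"

# Strip everything up to the last "/" to get the file name that
# "curl -O" will write into the current directory.
installer="${url##*/}"
echo "Installer file: $installer"

# Uncomment to download and start the interactive installer:
# curl -O "$url"
# bash "$installer"
```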
Upon successful installation, the installer prints a closing message with its own suggestions for activating and initializing conda.
Note
Please disregard this output. Users must manually configure Anaconda as described in the next section.
Editing .bashrc
To avoid conflicts between the user-installed Anaconda distribution and system-installed Python distributions, the $HOME/.bashrc file must be modified. The .bashrc file in the user's home space can be modified to set a specified environment every time you log in to an HPCC node. You can modify the .bashrc file with an editor such as vim or nano.
To modify the .bashrc file for Anaconda installations, please follow these steps:
1. Navigate to your home space and open the .bashrc file with an editor, e.g., run cd $HOME followed by vim .bashrc. Once in the vim editor, press the [i] key to enter "--insert--" mode.
2. Set the variable "CONDA3PATH" to the Anaconda3 installation directory by adding the line export CONDA3PATH=<Anaconda3 installation path>. Make sure to include the final / at the end of the path.
3. Save the modified .bashrc file by pressing the [esc] key to exit "--insert--" mode, followed by :wq to save and quit vim.
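For those who prefer not to edit the file by hand, the steps above could be approximated with a small helper such as the one below. The function name add_conda3path is hypothetical and not part of any HPCC tooling; it appends the export line only if no CONDA3PATH export exists yet, and adds the trailing / when it is missing.

```shell
# Hypothetical helper: add_conda3path BASHRC_FILE INSTALL_PATH
# Appends "export CONDA3PATH=..." to BASHRC_FILE unless such a line exists.
add_conda3path() {
    bashrc_file="$1"
    install_path="$2"

    # The instructions require a trailing "/" on the path.
    case "$install_path" in
        */) ;;
        *) install_path="$install_path/" ;;
    esac

    # Only append when no CONDA3PATH export is present, so repeated
    # calls do not duplicate the line.
    if ! grep -q '^export CONDA3PATH=' "$bashrc_file" 2>/dev/null; then
        printf 'export CONDA3PATH=%s\n' "$install_path" >> "$bashrc_file"
    fi
}

# Example call (assumes the installer's default location):
# add_conda3path "$HOME/.bashrc" "$HOME/anaconda3"
```

Calling the helper a second time with the same file leaves it unchanged, which makes it safe to rerun.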
Warning
If there was a block of code in your .bashrc file that begins with >>> conda initialize >>> and ends with <<< conda initialize <<< and these lines were not commented out (i.e., they did not begin with #), please run conda init --reverse to remove these lines. Alternatively, you may comment them out yourself.
Run logout to exit your SSH session on the development node, then reconnect with SSH to enable your changes.
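One way to check for the block described in the warning is a quick grep, sketched below. It assumes the marker text reads ">>> conda initialize >>>", which is what recent conda releases write; the function name has_conda_init_block is made up for this example. The check only reports that a marker is present, so the block still needs a manual look to see whether it is commented out.

```shell
# Hypothetical helper: succeed if FILE contains a conda initialize marker.
# Assumption: the marker text is ">>> conda initialize >>>".
has_conda_init_block() {
    grep -q '>>> conda initialize >>>' "$1" 2>/dev/null
}

if has_conda_init_block "$HOME/.bashrc"; then
    echo "conda init block found in $HOME/.bashrc; inspect it or run: conda init --reverse"
else
    echo "no conda init block found in $HOME/.bashrc"
fi
```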
Loading the Conda Module
As stated in the section Conda on the HPCC, users must load the HPCC-administered Conda module to ensure system-wide compliance when using and managing Conda environments and packages. To load the Conda/3 module, log in to a dev node and run the command
module load Conda/3
.
If you would like to automatically load the Conda/3 module upon login, add the command
module load Conda/3 2> /dev/null
to the .bashrc file after the export CONDA3PATH=... command.
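Put together, the relevant part of a .bashrc might look like the sketch below. The path assumes Anaconda went into the installer's default location, $HOME/anaconda3; adjust it to your own installation directory. The check for the module command is an addition for this example, so that the file can also be sourced on machines without the module system.

```shell
# Assumption: Anaconda was installed in the installer's default location.
export CONDA3PATH="$HOME/anaconda3/"

# Load the site Conda module only where the module command exists.
# Errors are discarded and ignored so that a login shell never aborts here.
if command -v module >/dev/null 2>&1; then
    module load Conda/3 2> /dev/null || true
fi
```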
Note
Loading the Conda/3 module will also replace any previously loaded Python module to avoid conflicts.
Managing Conda
To ensure conda is properly installed and determine the installed version, use the command
$ conda --version
If properly installed, the conda version will be output to the display. For example,
conda 22.11.0
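The same check can be scripted, for instance at the top of a setup script. This is a minimal sketch: it tests whether a conda command is visible in the current shell and, if not, points back to the module load step. The variable name conda_status is arbitrary.

```shell
# Report whether conda is available in the current shell.
if command -v conda >/dev/null 2>&1; then
    conda_status="conda found: $(conda --version 2>&1)"
else
    conda_status="conda not found; try: module load Conda/3"
fi
echo "$conda_status"
```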
To update conda to the most recent version, use the command
$ conda update conda
Conda will display the packages to be updated and ask for confirmation.
Type y to continue with the update.
Managing Environments
To create a conda environment, use the command
conda create --name <environment_name>
where the text <environment_name>
is to be replaced with the
name you choose. For example, to create an environment named "myenv" use the command
conda create --name myenv
Conda will display the location of the new environment and ask for confirmation.
Type y to create the environment. No packages have been installed in this environment yet.
To create a conda environment with pre-specified packages and/or versions of Python, use the command above with additional arguments. Here are a few examples that demonstrate the syntax:
Create an environment named "bioenv" with the "biopython" package
conda create --name bioenv biopython
Create an environment named "scienv" with Python version 3.9 and version 1.6.0 of the "scipy" package
conda create --name scienv python=3.9 scipy=1.6.0
Create an environment named "astroenv" with version 1.6.0 of the "scipy" package, the current "astroid" package, and the current "babel" package
conda create --name astroenv scipy=1.6.0 astroid babel
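When environments are created from a script, the confirmation prompt can be answered automatically with the --yes flag. The sketch below wraps the "scienv" example in a small run helper; both run and the DRY_RUN variable are hypothetical names for this example and not conda features. With DRY_RUN=1, the default here, the command is only printed, which allows the line to be reviewed before anything is created.

```shell
# DRY_RUN is a hypothetical switch for this sketch, not a conda setting.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# --yes answers the confirmation prompt automatically.
run conda create --yes --name scienv python=3.9 scipy=1.6.0
```

Setting DRY_RUN=0 before running the snippet would execute the command for real.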
To copy an existing environment, use the command
conda create --name <new environment name> --clone <existing environment name>
For example, the command
conda create --name newenv --clone myenv
will create a new environment named "newenv" that contains the same packages as the existing environment "myenv".
To display a list of all conda environments, use the command
conda info --envs
The output lists each environment with its path, and the active environment is denoted with the * symbol.
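In scripts it can be useful to test whether an environment already exists before creating it. The helper below is a sketch that reads the output of conda info --envs on standard input and compares the first column of each line to the requested name. The function name env_in_list is hypothetical.

```shell
# Hypothetical helper: env_in_list NAME
# Reads "conda info --envs" output on stdin and succeeds if an
# environment called NAME appears in the first column.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example use (requires conda to be available):
# conda info --envs | env_in_list myenv && echo "myenv exists"
```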
To activate a conda environment, use the command
conda activate <environment_name>
The name of the active environment is shown in parentheses in front of the command prompt. For example, the command
conda activate astroenv
will result in the new command prompt (astroenv) $.
To switch to another environment, just use the conda activate
command with the new environment name.
To deactivate the current conda environment, use the command
conda deactivate
To remove an environment, use the command
conda remove --name <name of environment> --all
Managing Packages
To list all packages currently installed in an environment, first activate the environment then use the command
conda list
For example, after running the commands
conda activate scienv
conda list
the output will list each installed package with its version, build, and channel.
To search for a package in the Anaconda repository that you would like to install, use the command
conda search <package name>
where the <package name>
is to be replaced with the name of package to search for. For example, the command
conda search beautifulsoup4
lists the versions and builds of beautifulsoup4 available in the configured channels.
To install a package in the active environment, use the command
conda install <package name>
where <package name>
is to be replaced with the name of package to install. For example, the command
(scienv) $ conda install beautifulsoup4
will display the packages to be installed and ask for confirmation.
Type ‘y’ to install the package.
To install a package in another environment, use the command
conda install --name <environment name> <package name>
where <environment name>
is to be replaced with the name of the target environment and <package name>
is to be replaced with the name of package to install. For example, the command
(scienv) $ conda install --name myenv beautifulsoup4
will install the package "beautifulsoup4" in the inactive environment "myenv".
Note
It is best to install all packages at once, so that all dependencies are installed at the same time.
Not all packages can be installed with the simple command conda install. Some packages may reside in a private package repository hosted by
Anaconda.org. As an example, we will illustrate the process using a public package called ‘bottleneck’.
Use a web browser to visit the webpage anaconda.org and enter the text "bottleneck" into the search bar.
Note
To search for packages in a private repository you will have to 'Sign Up' and 'Sign In'.
Choose the appropriate version (here we choose the most frequently downloaded) and click on the text ‘bottleneck’.
This will display all the information available on the package, including the commands used to install it. In this case, we want to install the standard "bottleneck" package via the "conda-forge" channel so we choose the command
conda install -c conda-forge bottleneck
where the -c
flag designates the channel.
If the packages you are interested in are not available from conda or Anaconda.org, use the 'pip' package manager within a conda environment via the command
pip install <package name>
where <package name>
is to be replaced by the name of the desired package.
For example, the commands
conda activate scienv
pip install see
will install the package "see" in the active environment "scienv".
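Before running pip install, it may be worth confirming that the pip on your PATH belongs to the active environment; otherwise the package could end up somewhere else. The sketch below relies on the CONDA_PREFIX variable, which conda sets when an environment is activated. The function name pip_in_active_env is hypothetical.

```shell
# Hypothetical helper: succeed only if a conda environment is active
# and the pip found on PATH lives inside that environment.
pip_in_active_env() {
    # CONDA_PREFIX is set by "conda activate"; empty means nothing is active.
    [ -n "${CONDA_PREFIX:-}" ] || return 1
    pip_path="$(command -v pip 2>/dev/null)" || return 1
    case "$pip_path" in
        "$CONDA_PREFIX"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if pip_in_active_env; then
    echo "pip belongs to the active environment: $CONDA_PREFIX"
else
    echo "pip is outside the active conda environment (or none is active)"
fi
```

If the check fails inside an activated environment, installing pip into it with conda install pip is one possible fix.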
To update a package, use the command
conda update <package name>
To remove a package from the active environment, use the command
conda remove <package name>
where <package name>
is to be replaced by the name of the package to be removed.
To remove a package from another environment, use the command
conda remove --name <environment name> <package name>
where <environment name>
is to be replaced with the name of the target environment and <package name>
is to be replaced with the name of package to be removed. For example, the command
(scienv) $ conda remove --name myenv beautifulsoup4
will remove the package "beautifulsoup4" in the inactive environment "myenv".
Using conda with SLURM
You can activate a conda environment from within a SLURM job script. Include the conda activate <environment name> and conda deactivate commands in the 'bash command' portion of the SLURM job script. Be sure to first navigate into the directory where Anaconda is installed, e.g., cd $HOME.
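A job script along these lines might look like the sketch below, which writes a hypothetical script to conda_job.sb. The file name, job name, resource requests, environment name "myenv", and the placeholder workload are all assumptions made for illustration. It also assumes that loading the Conda/3 module makes conda activate usable inside a batch job, as this page implies.

```shell
# Write a hypothetical job script; every name and resource value below
# is a placeholder chosen for illustration.
cat > conda_job.sb <<'EOF'
#!/bin/bash
#SBATCH --job-name=conda_example
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=2G

# Navigate to the directory where Anaconda is installed.
cd "$HOME"

# Load the site Conda module, then activate the environment.
module load Conda/3
conda activate myenv

# Placeholder workload: report which Python the environment provides.
python --version

conda deactivate
EOF

echo "Wrote conda_job.sb; submit it with: sbatch conda_job.sb"
```

Replace the python --version line with the commands your job actually needs to run.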