Using Conda
Attention
This wiki serves as a very limited introduction. ICER strongly recommends reading the conda user guide specific to the version of interest before you start using conda on the HPCC.
Introduction
For most Python users, installing packages is done simply with the command pip install AwesomePackage on the terminal, or similar in Jupyter Notebooks.
However, for complex research software, it's sometimes not that simple. A Python package may need other software and libraries, so-called "dependencies," to be installed on the computer or provided by the operating system, and all of the versions must be compatible. This is not just true of Python: most systems have a "package management" system that works with recipes for installing the correct versions of many programs. The Anaconda company contributed the open-source Conda package manager for Python and scientific software, and it has been very successful.
Specifically, Conda is an open-source package management system that installs and updates packages and their dependencies. Conda also easily creates, saves, loads, and switches between environments on the HPCC. It was created for Python programs, but it can package and distribute software for any language, and it gives access to a collection of 1,000+ open-source packages with free community support.
Miniforge is a minimal installer that provides access to Conda and a large set of community-contributed packages via conda-forge. Miniforge and Conda are both available with a permissive license.
You may also be familiar with Anaconda, which also distributes Conda and plays a role similar to Miniforge. However, Anaconda uses a different installer and provides a curated list of packages. Anaconda is not available under a permissive license.
ICER recommends using Miniforge. However, the two distributions act similarly and will give access to the same conda program.
There are two methods for using Conda on the HPCC. The first is to use the Miniforge3 module. The second is to install Miniforge manually, as detailed in the appendix.
Using the Miniforge3 Module (Quick Start Option)
To access the conda command, first unload other modules to ensure you do not conflict with other versions of Python. Then load the Miniforge3 module:
module purge
module load Miniforge3
When you create Conda environments and install packages, they are stored in your home directory. For example, an environment named myenv is installed under the path /mnt/home/$USER/.conda/envs/myenv.
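A quick way to confirm that the module made the conda command available is to look it up on the PATH. The following is a minimal sketch of such a check; the helper name have_cmd is hypothetical and is not part of Conda or the HPCC module system.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# After `module purge` and `module load Miniforge3`, this should report
# "conda: found". The `|| true` keeps a script going if it does not.
have_cmd conda || true
```

If the check reports that conda is not found, reload the module before creating or activating environments.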
Using module swap
You can also use module swap to get access to the Miniforge3 module while keeping other modules loaded. However, swapping with the Python module by itself is not enough. The better option is to use
module swap Python-bundle-PyPI Miniforge3
This ensures that no extra Python packages installed by ICER are accessible from your Miniforge environments.
Managing Environments
To create a conda environment, use the command
conda create --name <environment_name>
where the text <environment_name> is to be replaced with the name you choose. For example, to create an environment named "myenv" use the command
conda create --name myenv
Conda will display the location of the new environment and ask whether to proceed. Type y to create the environment. No packages have been installed in this environment yet.
To create a conda environment with pre-specified packages and/or versions of Python, use the command above with additional arguments. Here are a few examples that demonstrate the syntax:
Create an environment named "bioenv" with the "biopython" package
conda create --name bioenv biopython
Create an environment named "scienv" with Python version 3.9 and version 1.6.0 of the "scipy" package
conda create --name scienv python=3.9 scipy=1.6.0
Create an environment named "astroenv" with version 1.6.0 of the "scipy" package, the current "asteroid" package, and the current "babel" package
conda create --name astroenv scipy=1.6.0 asteroid babel
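The package specifications described above follow a simple pattern: a bare name requests the current version, and name=version pins a specific one. As a small sketch of that pattern, the hypothetical helper below only prints the resulting conda create command so it can be reviewed before it is run; it is an illustration, not an ICER-provided tool.

```shell
# Hypothetical helper: print (not run) a `conda create` command.
# Arguments: environment name, then zero or more package specs.
build_create_cmd() {
    env_name="$1"
    shift
    printf 'conda create --name %s' "$env_name"
    for spec in "$@"; do
        printf ' %s' "$spec"
    done
    printf '\n'
}

build_create_cmd scienv python=3.9 scipy=1.6.0
# prints: conda create --name scienv python=3.9 scipy=1.6.0
```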
To copy an existing environment, use the command
conda create --name <new environment name> --clone <existing environment name>
For example, the command
conda create --name newenv --clone myenv
will create a new environment named "newenv" that contains the same packages as the existing environment "myenv".
To display a list of all conda environments, use the command
conda info --envs
The output lists the name and path of each environment, where an active environment is denoted with the * symbol.
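In scripts it can be handy to test whether an environment already exists before creating or activating it. The sketch below reads an environment listing on standard input and checks the first column for a name. The helper name env_exists and the sample listing (including its paths) are illustrative assumptions, not real HPCC output.

```shell
# Hypothetical helper: succeed if the named environment appears in the
# `conda info --envs` listing supplied on standard input.
env_exists() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Illustrative listing with made-up paths, in the layout conda prints.
sample='# conda environments:
#
base                  *  /opt/miniforge3
myenv                    /mnt/home/user/.conda/envs/myenv'

if printf '%s\n' "$sample" | env_exists myenv; then
    echo "myenv exists"
fi

# With conda loaded: conda info --envs | env_exists myenv
```

Header lines start with #, so they never match an environment name.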
To activate a conda environment, use the command
conda activate <environment_name>
The name of the active environment appears in parentheses in front of the command prompt. For example, the command
conda activate astroenv
will result in the new command prompt (astroenv) $.
To switch to another environment, just use the conda activate command with the new environment name.
To deactivate the current conda environment, use the command
conda deactivate
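Besides the prompt, the active environment can be read from the CONDA_DEFAULT_ENV environment variable, which conda updates on activation and deactivation. A minimal sketch, with active_env as a hypothetical helper name:

```shell
# Print the name of the active conda environment, or "none" if the
# CONDA_DEFAULT_ENV variable is unset or empty.
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

active_env
```

This is useful in scripts, where no prompt is visible.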
To remove an environment, use the command
conda remove --name <name of environment> --all
Managing Packages
To list all packages currently installed in an environment, first activate the environment then use the command
conda list
For example, after running the commands
conda activate scienv
conda list
the output will list each package installed in the environment along with its version, build, and channel.
To search for a package in the conda-forge repository that you would like to install, use the command
conda search <package name>
where <package name> is to be replaced with the name of the package to search for. For example, the command
conda search beautifulsoup4
results in a list of the available versions and builds of the package, along with the channel that provides each one.
To install a package in the active environment, use the command
conda install <package name>
where <package name> is to be replaced with the name of the package to install. For example, the command
(scienv) $ conda install beautifulsoup4
will display the packages to be downloaded and installed and ask whether to proceed.
Type y to install the package.
To install a package in another environment, use the command
conda install --name <environment name> <package name>
where <environment name> is to be replaced with the name of the target environment and <package name> is to be replaced with the name of the package to install. For example, the command
(scienv) $ conda install --name myenv beautifulsoup4
will install the package "beautifulsoup4" in the inactive environment "myenv".
Note
It is best to install all packages at once, so that all dependencies are installed at the same time.
Not all packages can be installed with the simple command conda install.
Some packages may reside in other repositories or "channels". For example, Bioconda is a popular repository for tools useful in biomedical research. To install a package using a different channel, use the -c option like
conda install -c bioconda samtools
where the -c flag designates the channel.
Bioconda specific setup
If you plan to use Bioconda often, the Bioconda maintainers suggest running a one-time channel configuration to ensure that you can install compatible packages into your environment. See the Bioconda homepage for the setup commands and details.
If the packages you are interested in are not available from conda or another conda repository that you have access to, use the 'pip' package manager within a conda environment via the command
pip install <package name>
where <package name> is to be replaced by the name of the desired package.
For example, the commands
conda activate scienv
pip install see
will install the package "see" in the active environment "scienv".
Avoid mixing conda and pip installed packages
When creating a Conda environment, it is highly recommended to install all packages with conda only and not to use pip. If you need to install packages with both tools, install all conda packages first, then install pip packages. This ensures that the conda installed packages have the correct dependencies and setup.
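The recommended order (conda packages first, pip packages second) can be captured in a small wrapper. The sketch below is an assumption-laden illustration: the helper names run and install_conda_then_pip and the DRY_RUN variable are hypothetical. By default it only prints the commands; set DRY_RUN=0 inside an activated environment to run them.

```shell
# Print commands when DRY_RUN=1 (the default here); run them when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper. $1: space-separated conda packages,
# $2: space-separated pip packages. Conda packages always go first.
install_conda_then_pip() {
    conda_pkgs="$1"
    pip_pkgs="$2"
    if [ -n "$conda_pkgs" ]; then
        # Unquoted on purpose: each package becomes its own argument.
        run conda install --yes $conda_pkgs || return 1
    fi
    if [ -n "$pip_pkgs" ]; then
        run pip install $pip_pkgs || return 1
    fi
}

install_conda_then_pip "scipy beautifulsoup4" "see"
```

Passing all conda packages in one call also follows the advice to install packages together so that their dependencies are resolved at the same time.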
To update a package, use the command
conda update <package name>
To remove a package from the active environment, use the command
conda remove <package name>
where <package name> is to be replaced by the name of the package to be removed.
To remove a package from another environment, use the command
conda remove --name <environment name> <package name>
where <environment name> is to be replaced with the name of the target environment and <package name> is to be replaced with the name of the package to be removed. For example, the command
(scienv) $ conda remove --name myenv beautifulsoup4
will remove the package "beautifulsoup4" from the inactive environment "myenv".
Using conda with SLURM
You can activate a conda environment from within a SLURM job script. Include the conda activate <environment name> and conda deactivate commands in the 'bash command' portion of the SLURM job script. Be sure to first navigate into the directory where Miniforge is installed, e.g., cd $HOME.
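As a rough sketch of the guidance above, the following writes a small job script to a file that can then be submitted with sbatch. It assumes the Miniforge3 module is used; the resource values, the environment name myenv, and the script name my_script.py are placeholders. Whether conda activate works without further shell setup depends on how the module initializes conda, so treat this as a starting point rather than ICER's reference script.

```shell
# Write a minimal job script. Resource values, the environment name "myenv",
# and my_script.py are placeholders to adapt.
cat > conda_job.sb <<'EOF'
#!/bin/bash --login
#SBATCH --job-name=conda_example
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem=2G

# Start from a clean module state, then make conda available.
module purge
module load Miniforge3

# Navigate to the directory where Miniforge is installed (assumed $HOME).
cd "$HOME"

conda activate myenv
python my_script.py
conda deactivate
EOF

echo "wrote conda_job.sb; submit it with: sbatch conda_job.sb"
```

If you installed Miniforge manually, load the Conda/3 module in place of Miniforge3.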
Appendix: Installing Miniforge Manually (Advanced Option)
You may wish to install Miniforge yourself rather than using the version on the HPCC (e.g., to use a newer version).
Doing so requires two components: a default component installed and maintained by the HPCC administrators via the software module system, and a component that you install and maintain yourself. This dual-component configuration is required to ensure system-wide compliance for your customized Conda environments. Hence, if you are not using the Miniforge3 module, you must install Miniforge in your home or research space to have full control of your Conda-managed environments and packages, and you must also load the HPCC-administered Conda/3 module to allow those environments and packages to run smoothly on the HPCC.
Miniforge Installation Script
To install the user-managed component of Miniforge on the HPCC, visit https://conda-forge.org/miniforge/ and follow the instructions below:
- Find the link for the 64-bit (x86_64) installer for Miniforge3 for Linux and copy the URL associated with this link (right-click on the download link and select "copy link address" or "copy link location"). For example, https://github.com/conda-forge/miniforge/releases/download/24.3.0-0/Miniforge3-24.3.0-0-Linux-x86_64.sh.
- Log in to the HPCC, log in to a development node, navigate to the desired installation directory if it is somewhere other than your home space, then run the command
wget <copied link address>
where <copied link address> is to be replaced with the URL copied in the previous step.
- Once the Miniforge file is downloaded, run the command
bash <MiniforgeFileName.sh>
where <MiniforgeFileName.sh> is to be replaced with the name of the Miniforge file downloaded with the wget command. For example, Miniforge3-24.3.0-0-Linux-x86_64.sh.
During installation you will need to:
    - Accept the license terms; the output will display
      Do you accept the license terms [yes|no]?
      Read the license, and if you agree, type yes to accept.
    - Choose the installation location; the output will display
      Miniforge3 will now be installed into this location: /mnt/home/$USER/miniforge3
      - Press ENTER to confirm the location
      - Press CTRL-C to abort the installation
      - Or specify a different location below
      (note: it may take a long while to complete this step)
    - Choose whether to initialize conda; the output will display
      Do you wish the installer to initialize conda?
      by running conda init? [yes|no]
      [no] >>>
      Please type no; a more careful initialization is described after the installation instructions.
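The download and install steps can be sketched as follows. The URL is the example release link from the instructions above and should be replaced with the link you copied. The snippet only prints the two commands, so it is safe to run anywhere, and the installer itself remains interactive.

```shell
# Example release URL from the instructions; replace it with the link you copied.
url="https://github.com/conda-forge/miniforge/releases/download/24.3.0-0/Miniforge3-24.3.0-0-Linux-x86_64.sh"

# The installer file name is the last path component of the URL.
installer="$(basename "$url")"

# Print the two commands to run on a development node.
echo "wget $url"
echo "bash $installer"
```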
Note
Please remember the directory where Miniforge was installed. This installation path will be used next.
Upon successful installation, the installer prints a closing message that includes suggestions for activating and initializing conda.
Note
Please disregard this output. Users must manually configure Miniforge as described in the next section.
Editing .bashrc
To avoid conflicts between the user-installed Miniforge distribution and system-installed Python distributions, a modification of the $HOME/.bashrc file is necessary. The .bashrc file in your home space can be modified to set a specified environment every time you log in to an HPCC node. You can modify the .bashrc file with an editor such as vim or nano.
To modify the .bashrc file for Miniforge installations, please follow these steps:
- Navigate to your home space and open the .bashrc file with an editor, e.g., run cd $HOME followed by vim .bashrc. Once in the vim editor, press the [i] key to enter "--insert--" mode.
- Set the variable "CONDA3PATH" to the Miniforge installation directory by adding the command line export CONDA3PATH=<Miniforge3 installation path>. Make sure to include the final / at the end of the path.
- Save the modified .bashrc file by pressing the [esc] key to exit "--insert--" mode, followed by :wq to save and quit vim.
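For those who prefer not to edit the file by hand, the same change can be sketched as a small shell function that appends the export line only if it is not already present. The helper name add_conda3path is hypothetical, and the demonstration below writes to a temporary file instead of your real .bashrc.

```shell
# Hypothetical helper: append the CONDA3PATH export to a file once.
# $1: file to edit (e.g. "$HOME/.bashrc"), $2: Miniforge installation path.
add_conda3path() {
    bashrc="$1"
    # Strip one trailing slash if present, then add it back, so the
    # stored path always ends with "/".
    install_dir="${2%/}/"
    line="export CONDA3PATH=$install_dir"
    if grep -qF "export CONDA3PATH=" "$bashrc" 2>/dev/null; then
        echo "CONDA3PATH already set in $bashrc"
    else
        printf '%s\n' "$line" >> "$bashrc"
        echo "added: $line"
    fi
}

# Demonstration on a temporary file; the second call changes nothing.
demo_rc="$(mktemp)"
add_conda3path "$demo_rc" "$HOME/miniforge3"
add_conda3path "$demo_rc" "$HOME/miniforge3"
cat "$demo_rc"
rm -f "$demo_rc"
```

To apply it for real, pass "$HOME/.bashrc" as the first argument and your actual installation directory as the second.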
Warning
If your .bashrc file contains a block of code that begins with # >>> conda initialize >>> and ends with # <<< conda initialize <<<, and the lines between are not commented out (i.e., they do not begin with #), please run conda init --reverse to remove these lines. Alternatively, you may comment them out yourself.
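To check whether such a block is still active, you can count the uncommented lines between the two markers. The following is a minimal sketch: the helper name active_conda_init_lines is hypothetical, and the sample file is a stand-in, not the text that conda init actually writes.

```shell
# Hypothetical helper: count non-blank, uncommented lines between the
# conda initialize markers in the given file.
active_conda_init_lines() {
    awk '
        /^# >>> conda initialize >>>/ { inside = 1; next }
        /^# <<< conda initialize <<</ { inside = 0; next }
        inside && $0 !~ /^[ \t]*(#|$)/ { count++ }
        END { print count + 0 }
    ' "$1"
}

# Stand-in file for demonstration only.
demo_rc="$(mktemp)"
cat > "$demo_rc" <<'EOF'
export CONDA3PATH=/mnt/home/user/miniforge3/
# >>> conda initialize >>>
# a commented-out line
echo "stand-in for an active initialization line"
# <<< conda initialize <<<
EOF

active_conda_init_lines "$demo_rc"   # prints 1 for this sample
rm -f "$demo_rc"
```

A result of 0 for your own .bashrc means there is nothing left to remove or comment out.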
Run logout to exit your SSH session on the development node, then reconnect with SSH to enable your changes.
Loading the Conda Module
Warning
Loading the Conda/3 module requires that you first unload any Python modules with module unload Python-bundle-PyPI or module purge.
As stated above, users must load the HPCC-administered Conda module to ensure system-wide compliance when using and managing Conda environments and packages. To load the Conda/3 module, log in to a development node and run the command
module load Conda/3
If you would like to automatically load the Conda/3 module upon login, add the command
module swap Python-bundle-PyPI Conda/3
to the .bashrc file after the export CONDA3PATH=... command.
Difference between Conda/3 and Miniforge3 modules
If you have installed conda manually, use module load Conda/3. If you are using the one installed on the HPCC, use module load Miniforge3. As you read through ICER's documentation, you may see either module being used. Substitute the module that matches your choice.
Managing User-Installed Conda
To ensure conda is properly installed and to determine the installed version, use the command
$ conda --version
If properly installed, the conda version will be output to the display. For example,
conda 22.11.0
Conflict with locally installed packages
If you have used the pip install command to install packages before using Conda, you may find conflicts between the packages Conda uses and the packages locally installed into the $HOME/.local/lib/pythonX.Y directory.
For example, after running a conda command for the first time, you may see an error whose final line points to a directory under $HOME/.local/lib.
You will need to move your locally installed packages to a separate location. The final line of the error shows where the user-installed packages are. You can move them to a backup location, in case you need them later, using a command like
mv $HOME/.local/lib/python3.10 <backup location>
Make sure to replace python3.10 above with the version that shows up in your error message, and <backup location> with a directory of your choice. After this, you will no longer be able to use your locally installed packages. For this reason, ICER recommends not mixing Conda installed packages and locally installed packages.
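The move can be sketched as a small function that refuses to overwrite an existing backup. The helper name backup_local_packages and the .bak suffix are assumptions for illustration; the demonstration uses a temporary directory instead of your real $HOME/.local/lib.

```shell
# Hypothetical helper: move locally installed packages out of the way.
# $1: Python directory name, e.g. python3.10
# $2: library directory (defaults to $HOME/.local/lib)
backup_local_packages() {
    pyver="$1"
    libdir="${2:-$HOME/.local/lib}"
    src="$libdir/$pyver"
    dest="$libdir/$pyver.bak"        # assumed backup name
    if [ ! -d "$src" ]; then
        echo "nothing to move: $src"
        return 0
    fi
    if [ -e "$dest" ]; then
        echo "backup already exists: $dest"
        return 1
    fi
    mv "$src" "$dest" && echo "moved $src to $dest"
}

# Demonstration in a throwaway directory.
demo_lib="$(mktemp -d)"
mkdir -p "$demo_lib/python3.10/site-packages"
backup_local_packages python3.10 "$demo_lib"
rm -rf "$demo_lib"
```

Called with only the version argument, the function operates on $HOME/.local/lib.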
To update conda to the most recent version, use the command
$ conda update conda
Conda will display the packages to be updated and ask whether to proceed. Type y to continue with the update.