Using Python on the HPCC with virtual environments

Python applications usually depend on packages and modules that require specific versions of libraries. Two applications can therefore conflict when they need different versions of the same library, and a single global Python installation cannot easily meet the requirements of every application. To resolve this issue, users can create an isolated virtual environment with a particular version of Python in a self-contained directory in their home or research space. Packages installed inside that directory, along with their dependencies, are available only through the virtual environment. Different applications can then use different virtual environments to avoid conflicts.

Creating virtual environments

For compatibility, please create virtual environments and install packages on the dev-amd20 development node:

ssh dev-amd20

To create Python virtual environments, first load your preferred version of Python:

input
module purge
module load Python/3.11.3
module list Python
output
Currently Loaded Modules Matching: Python
  1) Python/3.11.3-GCCcore-12.3.0

It is also a good idea to create a directory labeled with the Python version to store corresponding virtual environments:

mkdir Python3.11.3
cd Python3.11.3

A virtual environment can be created by running python -m venv --copies <DIR>, where <DIR> is the directory in which the environment is created. For example, the following commands create a virtual environment named tutorial in the tutorial directory and list its contents:

input
python -m venv --copies tutorial  
ls tutorial  
output
bin  include  lib  lib64  pyvenv.cfg  
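The same step can also be scripted from Python with the standard library venv module. The following is a minimal sketch, not an HPCC-specific tool: the helper name create_env is hypothetical, and symlinks=False is the API setting that corresponds to the --copies flag. Load the Python module first, as described above.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir using copies, not symlinks."""
    # symlinks=False corresponds to the --copies command-line flag.
    # with_pip=True installs pip, as "python -m venv" does by default.
    builder = venv.EnvBuilder(symlinks=False, with_pip=with_pip)
    builder.create(env_dir)
    # Return the top-level entries so the result can be inspected.
    return sorted(entry.name for entry in Path(env_dir).iterdir())
```

Calling create_env("tutorial") from inside the Python3.11.3 directory should produce the same layout as the command above.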

The created tutorial environment can now be used by sourcing the script file activate under the bin directory:

source tutorial/bin/activate

Your prompt will now begin with the name of the virtual environment, for example:

(tutorial) user@host:~/Python3.11.3$

To leave the environment, run deactivate:

deactivate

The virtual environment name will then disappear from your prompt.
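Besides looking at the prompt, you can confirm from within Python whether the interpreter belongs to a virtual environment. This is a small sketch; the function name in_virtual_environment is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment in use:", in_virtual_environment())
    print("prefix:", sys.prefix)
```

After running deactivate, the same script should report that no virtual environment is in use.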

Use the --copies flag to ensure compatibility

If you have used venv before, you may not have seen the --copies flag. It tells venv to copy the Python executable into the environment instead of creating symbolic links to it. It is important to use this flag when creating a virtual environment on the HPCC because it reduces the environment's dependence on the system environment in which it was created. This allows you to use your virtual environments on different node types.
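To check whether an existing environment was created with copies, you can test whether its interpreter is a regular file or a symbolic link. The sketch below assumes the bin/python layout shown earlier; interpreter_is_copy is a hypothetical helper name.

```python
from pathlib import Path


def interpreter_is_copy(env_dir):
    """Return True if env_dir/bin/python is a regular file, not a symlink."""
    python_path = Path(env_dir) / "bin" / "python"
    # is_file() follows symlinks, so the symlink case is excluded explicitly.
    return python_path.is_file() and not python_path.is_symlink()
```

For example, interpreter_is_copy("tutorial") is expected to return True for an environment created with --copies. A result of False suggests the environment was created without the flag, or that the path is wrong.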

Activating virtual environments

To activate a previously created virtual environment (for example, after you log in, or in a job submission script), first load the corresponding module, then activate the virtual environment:

module purge
module load Python/3.11.3
source ~/Python3.11.3/tutorial/bin/activate

To deactivate the environment, run:

deactivate

Load the corresponding module

Make sure to load the same Python module you used to create the virtual environment. If you forget to load the module, or load a different version, you will likely receive an error like this:

python: error while loading shared libraries: libpython3.11.so.1.0: cannot open shared object file: No such file or directory

To help with this, we recommend creating virtual environments in a directory labeled with the version of Python used as described above.
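If you are unsure which Python version an environment was created with, the pyvenv.cfg file in the environment directory records it in a version entry. The following sketch reads that entry and compares it with the interpreter running the script. The helper names are hypothetical, and the script should be run with the Python from the loaded module before activating the environment.

```python
import sys
from pathlib import Path


def parse_pyvenv_cfg(text):
    """Parse the 'key = value' lines of a pyvenv.cfg file into a dict."""
    settings = {}
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip().lower()] = value.strip()
    return settings


def recorded_version(env_dir):
    """Return the Python version stored in env_dir/pyvenv.cfg, or None."""
    text = (Path(env_dir) / "pyvenv.cfg").read_text()
    return parse_pyvenv_cfg(text).get("version")


def running_version():
    """Return the version of the current interpreter as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def matches_running_python(env_dir):
    """Return True if the environment was created with this Python version."""
    return recorded_version(env_dir) == running_version()
```

A result of False from matches_running_python is a hint that a different Python module is loaded than the one used to create the environment.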

Installing packages from PyPI using pip

The most common use of pip is to install Python packages from the Python Package Index (PyPI) using a requirement specifier. The most common scenarios are described below.

Activate your virtual environment first!

Before running any of the commands below, make sure to activate your virtual environment. Otherwise, your packages will be installed in the wrong location.

For compatibility purposes, install all packages using the dev-amd20 development node:

ssh dev-amd20

To install the latest version of a Python package, run python -m pip install <Package Name>. For example, using sympy as <Package Name>:

input
# Activate your virtual environment first if not activated (see above)
python -m pip install sympy
output
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting sympy
  Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 99.0 MB/s eta 0:00:00
Collecting mpmath<1.4,>=1.1.0
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 211.6 MB/s eta 0:00:00
Installing collected packages: mpmath, sympy
Successfully installed mpmath-1.3.0 sympy-1.14.0

To install a specific version of a Python package, run python -m pip install <Package Name>==<Version Number>. For example, to install numpy with 2.3.2 as <Version Number>:

input
# Activate your virtual environment first if not activated (see above)
python -m pip install numpy==2.3.2
output
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting numpy==2.3.2
  Downloading numpy-2.3.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.9/16.9 MB 118.5 MB/s eta 0:00:00
Installing collected packages: numpy
Successfully installed numpy-2.3.2

To upgrade an already installed package to the latest version on PyPI, run python -m pip install --upgrade <Package Name>. For example, to upgrade the pip package itself:

input
# Activate your virtual environment first if not activated (see above)
python -m pip install --upgrade pip
output
Requirement already satisfied: pip in ./Python3.11.3/tutorial/lib/python3.11/site-packages (22.3.1)
Collecting pip
  Downloading pip-25.2-py3-none-any.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 85.8 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 22.3.1
    Uninstalling pip-22.3.1:
      Successfully uninstalled pip-22.3.1
Successfully installed pip-25.2
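If you need to run these installs from a Python script, one common approach is to invoke pip as a subprocess of the running interpreter, so that packages go into the same environment the script runs in. This is a sketch under that assumption; pip_install_command and pip_install are hypothetical helper names, and only the install subcommand and --upgrade flag shown above are used.

```python
import subprocess
import sys


def pip_install_command(package, version=None, upgrade=False):
    """Build the argument list for 'python -m pip install ...'."""
    requirement = f"{package}=={version}" if version else package
    # sys.executable is the interpreter running this script, so pip
    # installs into the currently active environment.
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(requirement)
    return command


def pip_install(package, version=None, upgrade=False):
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package, version, upgrade))
```

For example, pip_install("numpy", version="2.3.2") runs the same command as the pinned install above, and pip_install("pip", upgrade=True) mirrors the upgrade example.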

With pip, you can also list all installed packages and their versions with the command python -m pip freeze:

input
# Activate your virtual environment first if not activated (see above)
python -m pip freeze
output
mpmath==1.3.0
numpy==2.3.2
sympy==1.14.0
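A similar listing is available from inside Python through the standard library importlib.metadata module. The sketch below is not a replacement for pip freeze: it reports every distribution visible to the interpreter, which typically includes pip itself. The function name installed_packages is illustrative.

```python
from importlib import metadata


def installed_packages():
    """Return {name: version} for distributions visible to this interpreter."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata is missing or broken
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    listing = installed_packages()
    for name in sorted(listing, key=str.lower):
        print(f"{name}=={listing[name]}")
```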

For more detail, see the pip docs, which include a complete Reference Guide.

Reinstalling a virtual environment

To reinstall a virtual environment (e.g., to reinstall on a more compatible node type), first activate the old one, run python -m pip freeze, save the output to a file, and deactivate:

module purge
module load Python/3.11.3
source ~/Python3.11.3/tutorial/bin/activate
python -m pip freeze > requirements.txt
deactivate

Then create a new virtual environment following the instructions above. In this example, we'll call it new_tutorial. Activate it, and reinstall all packages in the requirements.txt file:

source ~/Python3.11.3/new_tutorial/bin/activate
python -m pip install -r requirements.txt
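One way to test the new environment is to confirm that every pinned version in requirements.txt is actually installed. The following sketch, run with the new environment active, handles only plain name==version lines as produced by pip freeze in the example above. All helper names are hypothetical.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a package name so that case, '-', '_' and '.' do not matter."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(requirements_text):
    """Extract {name: version} from 'name==version' lines; ignore other lines."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        # Drop comments and environment markers before looking for a pin.
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[normalize(name)] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    found = {normalize(name): version for name, version in installed.items()}
    return {
        name: (wanted, found.get(name))
        for name, wanted in pins.items()
        if found.get(name) != wanted
    }


def check_requirements(path="requirements.txt"):
    """Compare the pins in a requirements file with the active environment."""
    installed = {
        dist.metadata.get("Name") or "": dist.version
        for dist in metadata.distributions()
    }
    with open(path) as handle:
        return find_mismatches(parse_pins(handle.read()), installed)
```

An empty dictionary from check_requirements() means all pinned packages were found at the expected versions. A found value of None means the package is missing.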

After testing the new virtual environment, you can remove the old one by deleting the directory containing it:

rm -r ~/Python3.11.3/tutorial

Using system-installed Python packages

For many versions of Python, the HPCC has preinstalled Python packages as modules. These are mostly in the form of "bundles", e.g., Python-bundle-PyPI and SciPy-bundle, but there are some standalone packages-as-modules, including scikit-learn and PyTorch.

You can use these packages in addition to those you install yourself by loading them before you activate your virtual environment:

module purge
module load Python/3.11.3 SciPy-bundle/2023.07-gfbf-2023a scikit-learn/1.3.1-gfbf-2023a
source ~/Python3.11.3/tutorial/bin/activate

If you install additional packages into your virtual environment while these modules are loaded, make sure to load the same modules each time before you activate the environment in the future.
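When module-provided packages and your own packages are mixed, it can help to check where a given package is actually imported from. This is a minimal sketch using the standard library; package_origin and comes_from_environment are hypothetical names, and the check assumes a regular top-level package (for namespace packages the origin is None).

```python
import importlib.util
import sys
from pathlib import Path


def package_origin(name):
    """Return the file a top-level package would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def comes_from_environment(name):
    """Return True if the package resolves to a path under sys.prefix."""
    origin = package_origin(name)
    if origin is None:
        return False
    # In an active virtual environment, sys.prefix is the environment directory.
    return Path(origin).resolve().is_relative_to(Path(sys.prefix).resolve())
```

With the environment active, a result of False from comes_from_environment for a package that does import would indicate it is being picked up from outside the environment, for example from a loaded module.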

Managing different node types (how to fix "Illegal instruction" errors)

The HPCC has many different types of nodes you can run your code on. See a listing here.

When a virtual environment is created or a package is installed using pip, it is sometimes customized to the type of node you installed it on. It is not guaranteed that you can then run this package on other node types.

By using dev-amd20 to create virtual environments and install packages (as suggested above), you reduce the chance of installing incompatible packages, because it is part of the oldest cluster and is the most compatible with other node types. Additionally, using the --copies flag when creating a virtual environment (described above) limits the environment's dependence on the node type where it was created.
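An "Illegal instruction" error generally means a program tried to use a CPU instruction that the current node does not support. To compare node types, you can list the CPU feature flags that Linux reports in /proc/cpuinfo. The sketch below assumes an x86 Linux node; the flags avx2 and avx512f are illustrative examples only, not a statement of what any particular package requires.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=("avx2", "avx512f")):
    """Return the wanted flags that the CPU does not report."""
    available = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in available]


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the CPU description of the current node (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Running missing_features(read_cpuinfo()) on two different node types shows which of the listed features one node has and the other lacks.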

For more on this topic, see our page on Architecture Specific Compilation.