Using Python in HPCC with virtualenv
Python applications usually use packages and modules that require specific versions of libraries. As a result, two installed applications can conflict when they need different versions of the same library, and it is difficult to meet the requirements of every application with one global Python installation. To resolve this issue, users can create an isolated virtual environment with a particular version of Python in a self-contained directory of their home or research space. Packages and their dependencies installed inside that directory are available only through the virtual environment. Different applications can then use different virtual environments to avoid any conflict.
Creating virtual environments
For compatibility purposes, please create virtual environments and install packages using the dev-amd20 development node:
ssh dev-amd20
To create Python virtual environments, first load your preferred version of Python:
module purge
module load Python/3.11.3
module list Python
Currently Loaded Modules Matching: Python
1) Python/3.11.3-GCCcore-12.3.0
It is also a good idea to create a directory labeled with the Python version to store corresponding virtual environments:
mkdir Python3.11.3
cd Python3.11.3
A virtual environment can be created by running python -m venv --copies <DIR>, where <DIR> is the directory of the created environment. For example, the following command creates a virtual environment named tutorial in the tutorial directory:
python -m venv --copies tutorial
ls tutorial
bin include lib lib64 pyvenv.cfg
The created tutorial environment can now be used by sourcing the script file activate under the bin directory:
source tutorial/bin/activate
Your prompt will now begin with the name of the virtual environment, for example:
(tutorial) user@host:~/Python3.11.3$
To leave the environment, run deactivate:
deactivate
The virtual environment name will then disappear from your prompt.
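If the prompt is not visible (for example, inside a batch job), the same check can be made from Python itself. The following is a minimal sketch, not part of the HPCC tooling; the function name in_virtual_environment is our own. It relies on the standard behavior that sys.prefix differs from sys.base_prefix while a virtual environment is active.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("environment prefix:", sys.prefix)
```

Running this with the environment activated should report the path of the tutorial directory as the prefix; after deactivate, it should report the prefix of the loaded Python module instead.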
Use the --copies flag to ensure compatibility
If you have used venv before, you may not have seen the --copies flag. It is important to use this flag when creating a virtual environment on the HPCC because it reduces the environment's dependence on the system environment of the node where it was created. This allows you to use your virtual environments on multiple node types.
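For users who create environments from a script rather than the command line, the standard library's venv module exposes the same choice. The sketch below is illustrative only: the helper name create_copied_env is hypothetical, and it assumes the desired Python module is already loaded. In the venv module, passing symlinks=False to EnvBuilder is the programmatic counterpart of the --copies flag, so the interpreter is copied into the environment instead of being symlinked.

```python
import venv
from pathlib import Path


def create_copied_env(env_dir, with_pip=True):
    """Create a virtual environment whose interpreter is copied, not symlinked.

    Returns the path of the environment's pyvenv.cfg file.
    """
    # symlinks=False corresponds to the --copies command line flag.
    builder = venv.EnvBuilder(symlinks=False, with_pip=with_pip)
    builder.create(env_dir)
    return Path(env_dir) / "pyvenv.cfg"
```

Calling create_copied_env("tutorial") from the Python3.11.3 directory should produce the same layout as the command line example above.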
Activating virtual environments
To activate a previously created virtual environment (for example, after you log in, or in a job submission script), first load the corresponding module, then activate the virtual environment:
module purge
module load Python/3.11.3
source ~/Python3.11.3/tutorial/bin/activate
To deactivate the environment, run:
deactivate
Load the corresponding module
Make sure to load the same Python module you used to create the virtual environment. If you forget to load the module, or load a different version, you will likely receive an error like:
python: error while loading shared libraries: libpython3.11.so.1.0: cannot open shared object file: No such file or directory
To help with this, we recommend creating virtual environments in a directory labeled with the version of Python used as described above.
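If the directory label is missing or unclear, the environment itself records the Python version in its pyvenv.cfg file (visible in the ls tutorial output earlier). The following sketch shows one way to read it; the function names are our own, and it assumes the file contains a version entry, which environments created by venv normally do.

```python
import sys


def parse_pyvenv_cfg(text):
    """Parse the 'key = value' lines of a pyvenv.cfg file into a dict."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that do not contain '='
            settings[key.strip()] = value.strip()
    return settings


def matches_running_python(cfg_version, running=None):
    """Compare the major.minor part of a recorded version with the running Python."""
    if running is None:
        running = sys.version_info[:2]
    recorded = tuple(int(part) for part in cfg_version.split(".")[:2])
    return recorded == tuple(running)
```

For example, parse_pyvenv_cfg(open("tutorial/pyvenv.cfg").read())["version"] would be expected to give 3.11.3 for the environment created above, which tells you which Python module to load before activating it.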
Installing packages from PyPI using pip
The most common use of pip is to install Python packages from the Python Package Index (PyPI) with a requirement specifier. Some common usage scenarios are introduced below.
Activate your virtual environment first!
Before running any of the commands below, make sure to activate your virtual environment; otherwise, your packages will be installed in the wrong location.
For compatibility purposes, install all packages using the dev-amd20 development node:
ssh dev-amd20
To install the latest version of a Python package, run python -m pip install <Package Name>. For example, using sympy as <Package Name>:
# Activate your virtual environment first if not activated (see above)
python -m pip install sympy
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting sympy
Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 99.0 MB/s eta 0:00:00
Collecting mpmath<1.4,>=1.1.0
Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 211.6 MB/s eta 0:00:00
Installing collected packages: mpmath, sympy
Successfully installed mpmath-1.3.0 sympy-1.14.0
To install a specific version of a Python package, run python -m pip install <Package Name>==<Version Number>. For example, to install numpy with <Version Number> as 2.3.2:
# Activate your virtual environment first if not activated (see above)
python -m pip install numpy==2.3.2
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting numpy==2.3.2
Downloading numpy-2.3.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (16.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.9/16.9 MB 118.5 MB/s eta 0:00:00
Installing collected packages: numpy
Successfully installed numpy-2.3.2
To upgrade an already installed package to the latest version on PyPI, run python -m pip install --upgrade <Package Name>. For example, to upgrade the pip package itself:
# Activate your virtual environment first if not activated (see above)
python -m pip install --upgrade pip
Requirement already satisfied: pip in ./Python3.11.3/tutorial/lib/python3.11/site-packages (22.3.1)
Collecting pip
Downloading pip-25.2-py3-none-any.whl (1.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 85.8 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 22.3.1
Uninstalling pip-22.3.1:
Successfully uninstalled pip-22.3.1
Successfully installed pip-25.2
With pip, you can also list all installed packages and their versions with the command python -m pip freeze:
# Activate your virtual environment first if not activated (see above)
python -m pip freeze
mpmath==1.3.0
numpy==2.3.2
sympy==1.14.0
For more detail, see the pip docs, which include a complete Reference Guide.
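The installed packages can also be inspected from inside Python, which may be convenient for logging the environment at the start of a job. The sketch below uses the standard library's importlib.metadata; the helper names are our own, and the result is only roughly comparable to pip freeze, which applies additional filtering.

```python
from importlib import metadata


def installed_packages():
    """Return {name: version} for the distributions visible to this interpreter."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries whose metadata is missing or broken
            packages[name] = meta.get("Version")
    return packages


def format_like_freeze(packages):
    """Render the mapping as sorted 'name==version' lines."""
    ordered = sorted(packages.items(), key=lambda item: item[0].lower())
    return [f"{name}=={version}" for name, version in ordered]


if __name__ == "__main__":
    print("\n".join(format_like_freeze(installed_packages())))
```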
Reinstalling a virtual environment
To reinstall a virtual environment (e.g., to reinstall on a more compatible node type), first activate the old one, run python -m pip freeze, save the output to a file, and deactivate:
module purge
module load Python/3.11.3
source ~/Python3.11.3/tutorial/bin/activate
python -m pip freeze > requirements.txt
deactivate
Then create a new virtual environment following the instructions above. In this example, we'll call it new_tutorial. Activate it, and reinstall all packages in the requirements.txt file:
source ~/Python3.11.3/new_tutorial/bin/activate
python -m pip install -r requirements.txt
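One way to check that the reinstall reproduced the old environment is to run python -m pip freeze again in the new environment and compare the two lists. The following sketch does the comparison in Python. The function names are our own, the second file (say, new_requirements.txt) is a hypothetical name, and only lines of the form name==version are considered.

```python
def parse_requirements(text):
    """Map package name -> pinned version for 'name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # ignore blank lines, comments, and unpinned entries
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def diff_requirements(old_text, new_text):
    """Return packages that are missing or pinned differently in the new list.

    Each value is a pair (old_version, new_version); new_version is None
    when the package is absent from the new list.
    """
    old, new = parse_requirements(old_text), parse_requirements(new_text)
    return {
        name: (version, new.get(name))
        for name, version in old.items()
        if new.get(name) != version
    }
```

An empty result means every package pinned in the old list is present with the same version in the new one.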
After testing the new virtual environment, you can remove the old one by deleting the directory containing it:
rm -r ~/Python3.11.3/tutorial
Using system installed Python packages
For many versions of Python, the HPCC has preinstalled Python packages as modules. These are mostly in the form of "bundles", e.g., Python-bundle-PyPI and SciPy-bundle, but there are some standalone packages-as-modules, including scikit-learn and PyTorch.
You can use these packages in addition to those you install yourself by loading them before you activate your virtual environment:
module purge
module load Python/3.11.3 SciPy-bundle/2023.07-gfbf-2023a scikit-learn/1.3.1-gfbf-2023a
source ~/Python3.11.3/tutorial/bin/activate
If you install any additional packages into your virtual environment while these modules are loaded, make sure to load the same modules before activating your environment in the future.
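When system modules and a virtual environment are combined, it can be unclear which copy of a package Python actually picks up. The sketch below is one way to check; the helper names are our own. It uses importlib.util.find_spec to locate a top-level package without importing it and then tests whether that location lies under the active environment's prefix.

```python
import importlib.util
import sys
from pathlib import Path


def package_origin(name):
    """Return the file a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_inside_environment(path, prefix=None):
    """Report whether a file path lies under the given environment prefix."""
    # Defaults to sys.prefix, which is the environment directory inside a venv.
    prefix = Path(prefix if prefix is not None else sys.prefix).resolve()
    return prefix in Path(path).resolve().parents
```

For instance, with the modules above loaded and the environment activated, package_origin("numpy") would be expected to point into the SciPy-bundle installation unless you have also installed numpy into the virtual environment yourself.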
Managing different node types (how to fix "Illegal instruction" errors)
The HPCC has many different types of nodes you can run your code on. See a listing here.
When a virtual environment is created or a package is installed using pip, it is sometimes customized to the type of node you installed it on. It is not guaranteed that you can then run this package on other node types.
By using dev-amd20 to create virtual environments and install packages (as suggested above), you reduce the chance of installing incompatible packages, as it is the oldest cluster and most compatible with other node types.
Additionally, using the --copies flag when creating a virtual environment (described above) limits the dependence on the node type where the virtual environment was created.
For more on this topic, see our page on Architecture Specific Compilation.
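When diagnosing an "Illegal instruction" error, it can help to record what kind of node the code is running on, since the error typically means the code uses CPU instructions that the current processor does not support. The sketch below is a hedged illustration rather than an HPCC tool: the function names are our own, and it assumes a Linux x86 node where /proc/cpuinfo lists feature flags such as avx2 and avx512f.

```python
import platform


def cpu_flags(cpuinfo_text):
    """Extract the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def describe_node(cpuinfo_path="/proc/cpuinfo"):
    """Summarize the current node: hostname, architecture, and AVX support."""
    try:
        with open(cpuinfo_path) as handle:
            flags = cpu_flags(handle.read())
    except OSError:
        flags = set()  # the file is Linux-specific and may be absent elsewhere
    return {
        "node": platform.node(),
        "machine": platform.machine(),
        "avx2": "avx2" in flags,
        "avx512f": "avx512f" in flags,
    }


if __name__ == "__main__":
    print(describe_node())
```

Comparing this output between the node where a package was installed and the node where it fails can indicate whether a missing instruction set is the likely cause.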