Warning
TensorFlow requires specific instructions for a fully functional installation. As such, the instructions and recommendations on this page may differ slightly from other pages in ICER's documentation, but have been fully tested as of March 2023. For more general Conda and Python usage, please see our page on Using Conda.
Install TensorFlow using conda
In this tutorial, we will first install Anaconda in our home directory, then install TF in a newly created conda environment, and finally run a few TF commands to verify the installation.
Installing Anaconda in your home directory
A full guide to downloading Anaconda and installing it in your home directory is here. Following that guide, the commands below download and configure conda in your home directory on the HPCC (say, /mnt/home/user123/).
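The sequence can be sketched as follows. This is a minimal sketch under stated assumptions, not necessarily the exact commands from the guide: INSTALLER is a placeholder that you must set to an installer file name listed at https://repo.anaconda.com/archive/, ANACONDA_PREFIX assumes the default install location, and the function name install_anaconda is chosen here only for illustration. The steps are wrapped in a function so that nothing runs until you call it.

```shell
# Assumption: the installer's default location; adjust if you choose another path.
ANACONDA_PREFIX="$HOME/anaconda3"
# Placeholder: set INSTALLER to an Anaconda 3 Linux installer file name
# listed at https://repo.anaconda.com/archive/ before calling the function.
INSTALLER="${INSTALLER:-}"

install_anaconda() {
    if [ -z "$INSTALLER" ]; then
        echo "Set INSTALLER to an installer file name first." >&2
        return 1
    fi
    # Download the installer and run it interactively.
    wget "https://repo.anaconda.com/archive/$INSTALLER" || return 1
    bash "$INSTALLER" || return 1
    # Load conda into the current shell, then set up shell integration.
    source "$ANACONDA_PREFIX/bin/activate" || return 1
    conda init
    # Stop the base environment from activating at every login.
    conda config --set auto_activate_base false
}
```

After setting INSTALLER, run install_anaconda from a bash shell and answer the installer's prompts as described in the notes that follow.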
Notes
- We recommend downloading Anaconda 3, which corresponds to Python 3.
- In step 8 of the guide, Anaconda recommends entering "yes". However, we recommend "no" so that the installer does not modify your ~/.bashrc. After that, you will need to run source /mnt/home/user123/anaconda3/bin/activate and conda init as shown above.
- The last command above disables automatic activation of the base environment. This is necessary.
- By default, Anaconda will be installed in /mnt/home/user123/anaconda3/. You can specify an alternate installation path during this interactive process.
- The link after wget above can be replaced by a more recent version of the installer script from https://repo.anaconda.com/archive/
- If you encounter any errors, check your quota first by running quota, and make sure your home directory has enough space. Always fully delete any previously installed Anaconda before re-installing by repeating the steps.
Installing TF in a conda environment
After you've successfully installed Anaconda in your home directory, you can follow the commands below to install TF and troubleshoot some errors. After initial login, run ssh dev-amd20-v100 to log into our GPU dev-node.
Warning
Installing TensorFlow while on dev-amd20-v100 will restrict you to amd20 nodes with GPUs. You must specify amd20 as a constraint when submitting a batch job or starting an OnDemand session.
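A batch script that applies this constraint might look like the sketch below. The file name tf_job.sb, the job name, the time, memory, and GPU values, and the script name train.py are all illustrative assumptions; the activation lines assume the default Anaconda location and the environment name used on this page.

```shell
# Write a hypothetical job script; every resource value here is only an example.
cat > tf_job.sb <<'EOF'
#!/bin/bash
#SBATCH --job-name=tf_test
#SBATCH --time=00:30:00
#SBATCH --mem=8G
#SBATCH --gpus=1
# Match the node type on which TensorFlow was installed.
#SBATCH --constraint=amd20

# Make conda available, then activate the TensorFlow environment.
# Add any other environment setup lines your installation needs here.
export PATH="$HOME/anaconda3/bin:$PATH"
source "$HOME/anaconda3/bin/activate" tf_gpu_Feb2023

# Placeholder for your own script.
python train.py
EOF

# Submit with: sbatch tf_job.sb
```

Writing the script through a quoted here-document keeps $HOME and $PATH unexpanded until the job actually runs.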
If you are not familiar with basic conda commands (e.g., conda create/activate/install/deactivate), check out this conda cheatsheet. After creating a new conda environment (named tf_gpu_Feb2023 below) and activating it, the environment variable $CONDA_PREFIX will point to /mnt/home/user123/anaconda3/envs/tf_gpu_Feb2023.
At a minimum, you only need to modify the first line below, export PATH=..., so that the path points to the bin folder in your Anaconda installation. The rest of the commands can be copied and run directly in your terminal.
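The first steps, up to the point where $CONDA_PREFIX is set, can be sketched as follows. This is a minimal sketch, not the full installation: it assumes Anaconda is in the default location under your home directory, and PYTHON_VERSION is an assumed example value that you should replace with a version supported by the TensorFlow release you install. The TensorFlow-specific packages are not covered here.

```shell
# Assumption: Anaconda is in the default location under your home directory.
export PATH="$HOME/anaconda3/bin:$PATH"

ENV_NAME="tf_gpu_Feb2023"
# Assumed example value; pick a version supported by your TensorFlow release.
PYTHON_VERSION="3.9"

if command -v conda >/dev/null 2>&1; then
    # Create the environment without prompting for confirmation.
    conda create -y -n "$ENV_NAME" "python=$PYTHON_VERSION"
    # Activate the new environment in the current shell.
    source "$HOME/anaconda3/bin/activate" "$ENV_NAME"
    # The printed path is expected to end in anaconda3/envs/tf_gpu_Feb2023.
    echo "CONDA_PREFIX is $CONDA_PREFIX"
else
    echo "conda not found; check the export PATH line." >&2
fi
```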
Verifying the installation using simple commands
Now we'll run a few one-liners to test the installation right from the shell command line. Again, you need to be logged onto dev-amd20-v100, the GPU dev-node. If no errors pop up when executing these commands, you should be all set.
Note
You'll need to run the first four lines every time you want to start using TensorFlow. This includes any SLURM scripts you write to launch TF jobs.
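Checks of this kind can be sketched as below. These are not necessarily the page's own one-liners; they only use standard TensorFlow calls (tf.__version__, tf.config.list_physical_devices, tf.reduce_sum, tf.random.normal) and assume that the TensorFlow environment is already activated in the current shell. TF_STATUS is a variable introduced here for illustration.

```shell
# Report whether TensorFlow imports, then run a few one-line checks.
if python -c "import tensorflow" >/dev/null 2>&1; then
    TF_STATUS="ok"
    python -c "import tensorflow as tf; print(tf.__version__)"
    # An empty list here means TensorFlow cannot see a GPU.
    python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
    # Small end-to-end computation.
    python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
else
    TF_STATUS="missing"
    echo "TensorFlow is not importable; activate your environment first." >&2
fi
```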
More complicated testing code can be found in our TensorFlow model training examples.
Using TensorFlow in an OnDemand Jupyter notebook
If you would like to use TensorFlow from an Open OnDemand Jupyter notebook, you'll first need to install Jupyter.
Then, you need to edit a particular file to set up the LD_LIBRARY_PATH and XLA_FLAGS environment variables in the same way they are set above.
First, we'll make a backup of this file as demonstrated below.
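A backup step of this kind can be sketched as follows. It assumes that Jupyter was installed into the active environment (for example with conda install jupyter) and that kernel.json sits in the default ipykernel location, share/jupyter/kernels/python3 under $CONDA_PREFIX; you can confirm the real path with jupyter kernelspec list. The function name backup_kernel_json is hypothetical.

```shell
# Back up kernel.json in the given kernel directory.
# Assumption: the default ipykernel location inside the active environment;
# confirm the real path with: jupyter kernelspec list
backup_kernel_json() {
    kernel_dir="${1:-$CONDA_PREFIX/share/jupyter/kernels/python3}"
    if [ ! -f "$kernel_dir/kernel.json" ]; then
        echo "No kernel.json found in $kernel_dir" >&2
        return 1
    fi
    # -p keeps the original timestamps and permissions on the copy.
    cp -p "$kernel_dir/kernel.json" "$kernel_dir/kernel.json.bak"
    echo "Backup written to $kernel_dir/kernel.json.bak"
}
```

Call backup_kernel_json with no argument after activating the environment, or pass the directory reported by jupyter kernelspec list.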
With your favorite text editor, open kernel.json. Look for the following pattern at the end of the file.
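The pattern in question is presumably the closing bracket of the file's last nested block followed by the final closing bracket. Under that assumption, the edit adds a comma after the first bracket and an env block (a standard kernel spec field) before the last one. The sketch below prints such replacement text, filled in with whatever LD_LIBRARY_PATH and XLA_FLAGS hold in the current shell, so run it after your usual TensorFlow setup lines. The layout is inferred rather than copied from this page, and print_kernel_env is a hypothetical helper name.

```shell
# Print replacement text for the end of kernel.json, filled in with the values
# currently set in this shell. Run it after your usual TensorFlow setup lines.
print_kernel_env() {
    cat <<EOF
 },
 "env": {
  "LD_LIBRARY_PATH": "${LD_LIBRARY_PATH:-}",
  "XLA_FLAGS": "${XLA_FLAGS:-}"
 }
}
EOF
}

print_kernel_env
```

Paste the printed lines over the closing brackets in kernel.json, then check that the file is still valid JSON before starting a notebook.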
Note
If you open a notebook and get a message about no kernel being available, make sure you added the comma after the first curly bracket.
When you request a Jupyter notebook through OnDemand, make sure to do the following:
- Request more than the minimum amount of memory (on the order of GB)
- Select "Launch Jupyter Notebook using the Anaconda installation in my home directory"
- Enter the full path to your Anaconda installation; e.g., /mnt/home/user123/anaconda3
- Enter the name of your TF Conda environment; e.g., tf_gpu_Feb2023
- Select "Advanced Options"
- Set the node type to amd20
- Request 1-4 GPUs
Even if you have requested less than 4 hours of wall time, your job may spend more time in the queue than you are used to. This is normal given the specific resources requested.
You can test that TensorFlow will run in your notebook by running the following: